PREFACE


This volume is intended to provide an introductory treatment of the 
fundamental processes of numerical analysis which is compatible with the 
expansion of the field brought about by the development of the modern 
high-speed calculating devices, but which also takes into account the fact 
that very substantial amounts of computation will continue to be effected 
by desk calculators (and by hand or slide rule), and that familiarity with 
computation on a desk calculator is a desirable preliminary to large-scale 
computation in any case. 

Whereas the availability of large-scale rapid calculators has made 
feasible the numerical solution of many problems of previously prohibitive 
complexity, the effective use of such devices depends strongly upon con- 
tinued advancement of research in relevant fields of mathematical 
analysis. In addition, there is a continuing need for personnel sufficiently 
well grounded in both the principles and the existent techniques of 
numerical analysis to be capable, not only of adapting available methods 
to a given machine, but also of comprehending the inherent limitations of 
each method and of devising error controls and modified techniques which 
are particularly appropriate to that machine. 

The urgency of the initial need for such personnel required a training 
which led as rapidly as possible to a fair working knowledge of techniques 
associated with some of the more sophisticated aspects of the field, and 
in which only a limited amount of time was available for the treatment of 
the fundamental elementary processes upon which those techniques are 
based. However, it would appear that the long-range needs will demand 
a somewhat more comprehensive college teaching program. 

The present text is based on the premise that the introductory course 
should provide a fairly substantial grounding in the basic operations of 
computation, approximation, interpolation, numerical differentiation and 
integration, and the numerical solution of equations, as well as in applica- 
tions to such processes as the smoothing of data, the numerical summation 
of series, and the numerical solution of ordinary differential equations. 
It is believed that this course not only should exhibit techniques available 
for each purpose, but also should attempt to derive the relevant formulas 
in such a way that the underlying hypotheses are in evidence and that 
methods of generalization and modification are reasonably apparent, and 
that the problems of error analysis, convergence, and stability should be 
treated as adequately as time and preparation permit. Furthermore,
the course desirably should be accompanied by a problem laboratory, in
which enough actual computation is effected (presumably by use of desk 
calculators) to establish the practical significance of the theoretical 
developments. 

Such an introduction should afford preparation for an “advanced
course,” dealing with certain of the somewhat more sophisticated aspects
of the solution of equations and with modern methods of matrix inversion 
and determination of characteristic values of matrices, together with the 
numerical solution of partial differential equations and of integral equa- 
tions, and for a course dealing specifically with the principles of large- 
scale calculating devices and with the various processes of organizing 
computations (programming) for them. 

This book is a revised version of a set of mimeographed notes originally 
prepared for classroom use in an introductory course offered at the Mas- 
sachusetts Institute of Technology. It is hoped that sufficient detail is 
included to permit an instructor to provide a survey of a substantial por- 
tion of the text in a single semester, when this is desirable, by delegating 
liberal amounts of the material to home reading and restricting classroom 
treatments to the more important or more difficult points, or, preferably, 
to provide a more thorough coverage in two semesters. Some of the more 
involved error analyses, such as those of §§5.11, 5.12, 6.8, 6.11, and 6.12, 
can be omitted in a first course without materially interrupting the con- 
tinuity of the presentation. 

The chapter treating the numerical solution of equations is independent 
of the other chapters and is placed at the end of the text, so that relevant 
portions of its content can be inserted when they are needed in other 
developments, at the discretion of the instructor. Thus, for example, 
some information relative to the practical solution of sets of linear alge- 
braic equations should precede the consideration of least-squares methods 
in Chapter 7. Alternatively, it may be desirable to introduce part or all 
of Chapter 10 immediately following the introductory Chapter 1. 

In order to facilitate the use of the volume for reference purposes, a 
directory of techniques is included in Appendix C. 

The presentations were influenced particularly by the earlier texts of 
Steffensen and of Whittaker and Robinson, and by the more recent texts 
of Milne. The author is indebted also to a number of colleagues and 
students for suggestions and criticisms; to the Office of Naval Research 
for the stimulus afforded by a sponsored project on machine methods of 
computation and numerical analysis at the Massachusetts Institute of 
Technology, under the direction of Professor P. M. Morse; and to Mrs. 
Mary Kuntavanish, Mrs. Dorothy Harvey, and Miss Ruth Goodwin for 
assistance in the preparation of the manuscript. 

F. B. Hildebrand

CHAPTER 1


INTRODUCTION 


1.1. Numerical Analysis. The ultimate aim of the field of numerical 
analysis is to provide convenient methods for obtaining useful solutions 
to mathematical problems and for extracting useful information from 
available solutions which are not expressed in tractable forms. Such 
problems may each be formulated in terms of an algebraic or transcen- 
dental equation, an ordinary or partial differential equation, or an integral 
equation, or in terms of a set of such equations. 

This formulation may correspond exactly to the situation which it is 
intended to describe; more often, it will not. Analytical solutions, when 
available, may be precise in themselves, but may be of unacceptable 
form because of the fact that they are not amenable to direct interpreta- 
tion in numerical terms, in which case the numerical analyst may attempt 
to devise a method for effecting that interpretation in a satisfactory 
way, or he may prefer to base his analysis instead upon the original 
formulation. 

More frequently, there is no known method of obtaining the solution 
in a precise form, convenient or otherwise. In such a case, it is necessary 
either to attempt to approximate the problem satisfactorily by one which
is amenable to precise analysis, to obtain an approximate solution to the 
original problem by methods of numerical analysis, or to combine the two 
approaches. 

On the other hand, the problem itself may not be clearly defined, and 
the analyst may be provided only with its partial solution, perhaps in the 
form of a table of approximate data, together with a certain amount of 
information with regard to its reliability, or perhaps in the form of an 
integral defining a function which cannot be expressed in terms of a finite 
number of tabulated functions. His purpose is then to obtain additional 
useful information concerning the function so described. 

Generally the numerical analyst does not strive for exactness. Instead, 
he attempts to devise a method which will yield an approximation differ- 
ing from exactness by less than a specified tolerance, or by an amount 
which has less than a specified probability of exceeding that tolerance. 
When the information supplied to him is inexact, he attempts both to
obtain a dependable measure of the uncertainty which results from that
inexactness and also to obtain an approximation which possesses a spec- 
ified reliability compatible with that uncertainty. 

He tries to devise a procedure which would be capable of affording an 
arbitrarily high degree of accuracy, in a wide class of situations, if the 
reliability of given information and of available calculating devices were 
correspondingly high. Even when successful in this attempt, he still 
seeks alternative procedures which may possess certain advantages in 
convenience, in certain situations, but which may be of less general appli- 
cability, or which may have the property that the degree of accuracy 
obtainable, even under ideal circumstances, cannot exceed a certain limit 
which depends upon the function to be analyzed. In this last case, 
which is of frequent occurrence, he attempts to ascertain that limit and 
to classify the situations in which it is not sufficiently high. 

Needless to say, there are relatively few situations in which all these 
objectives have been, or can be, perfectly attained, as will be illustrated 
in the sequel. However, research with these aims in view continues to 
provide new procedures, as well as additional information with regard 
to the basic advantages and disadvantages of the older ones. Additional 
impetus has been afforded by the comparatively recent development of 
automatic desk calculators and of large-scale computing machines. For 
example, certain methods have long been known to possess important 
theoretical advantages, but have not been convenient, from the point of 
view of the labor and time involved, for use in hand calculation or in 
calculation based on the use of the slide rule or of tables of logarithms, 
and hence have been considered as little more than mathematical curios- 
ities. However, technological developments have promoted several of 
them into a much more active status and have also created additional 
need for reexamination and modification of other existing methods and 
for a search for new ones. 

Perhaps the most rapidly expanding phase of numerical analysis is that 
which deals with the approximate solution of partial differential equa- 
tions. But a basic understanding of the more involved problems which 
arise in that phase of the analysis depends strongly upon familiarity with 
similar problems which arise, in a somewhat simpler way, in connection 
with the processes of interpolation, numerical differentiation and inte- 
gration, and the approximate solution of ordinary differential equations, 
in which only one independent variable is involved. These are the topics 
which are to be treated, for the most part, in what follows. 

1.2. Approximation. In many of the problems which arise in numer- 
ical analysis, we are given certain information about a certain function, 
say f(x), and we are required to obtain additional or improved
information, in a form which is appropriate for interpretation in terms of
numbers. Usually f(x) is known or required to be continuous over the range
of interest.

A technique which is frequently used in such cases can be described, in
general terms, as follows. A convenient set of n + 1 coordinate functions,
say $\phi_0(x), \phi_1(x), \ldots, \phi_n(x)$, is first selected. Then a
procedure is invented which has the property that it would yield the desired
additional information simply and exactly (barring inaccuracies in
calculation) if f(x) were a member of the set $S_n$ of all functions which
are expressible exactly as linear combinations of the coordinate functions.
Next, use is made of an appropriate selective process which tends to choose
from among all functions in $S_n$ that one, say $y_n(x)$, whose properties
are as nearly as possible identified with certain of the known properties of
f(x). In particular, it is desirable that the process be one which would
select f(x) if f(x) were in $S_n$. The required property of f(x) is then
approximated by the corresponding property of $y_n(x)$. Finally, a method is
devised for using additional known properties of f(x), which were not
employed in the selective process, for estimating the error in this
approximation.

Clearly, it is desirable, first of all, to choose coordinate functions which
are convenient for calculational purposes. The n + 1 functions
$1, x, x^2, \ldots, x^n$, which generate the algebraic polynomials of degree
n or less, are particularly appropriate, since polynomials are readily
evaluated and since their integrals, derivatives, and products are also
polynomials.

Of much greater importance, however, is the natural requirement that
it be possible, by taking n sufficiently large, to be certain that the set
$S_n$ of generated functions will contain at least one member which
approximates the function f(x) within any preassigned tolerance, within the
interval of interest. It is a most fortunate fact that the convenient set
$S_n$, which consists of all polynomials of degree n or less, possesses this
property if only f(x) is continuous over that interval and the interval is of
finite extent.†

Principally for these two reasons, polynomial approximation is of wide
general use when the function to be approximated is continuous and the
interval of approximation is finite, as well as in certain other cases, and
accordingly is to form the basis of the major part of the work which
follows. Other types of approximations are considered in Chap. 9.

Following the choice of the set $S_n$, an appropriate selective process
must be chosen in accordance with the nature of the available information
concerning the function f(x). When the value of f(x) is known for at
least n + 1 values of x, say $x_0, x_1, \ldots, x_n$, the simplest and most
often used process consists in selecting, from the members of $S_n$, a
function $y_n(x)$ which takes on the same value as does f(x) for each of
those n + 1 values of x. Here again the choice of polynomials is convenient.
For, whereas in the general case there may be no such function in $S_n$, or
there may be several, it is a well-known fact that there exists one and only
one polynomial of degree n or less which takes on prescribed values at each
of n + 1 points. In particular, if f(x) is indeed in $S_n$, this process will
then select it. Another useful process is described in Chap. 7.

† This fact was established in 1885 by a famous theorem of Weierstrass, which
states, in fact, that any function which is continuous in a closed interval can
be uniformly approximated within any prescribed tolerance, over that interval,
by some polynomial.
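The selective process just described (agreement with f(x) at n + 1 points) can be sketched in a few lines. The following is only an illustration under stated assumptions: the Lagrange form is used to evaluate the interpolating polynomial, exact rational arithmetic is used so that round-off plays no part, and the function name `lagrange_eval`, the sample cubic `f`, and the chosen abscissas are hypothetical choices, not taken from the text. It shows that when f(x) is itself a polynomial of degree 3, interpolation at four points selects f(x) exactly.

```python
from fractions import Fraction

def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique polynomial of degree <= n that takes the
    values ys[i] at the n + 1 distinct abscissas xs[i] (Lagrange form)."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

def f(x):
    # A member of S_3: a cubic chosen arbitrarily for the demonstration.
    return 2 * x**3 - x + 5

xs = [Fraction(k) for k in range(4)]   # n + 1 = 4 distinct abscissas
ys = [f(x) for x in xs]

# Because f is itself a cubic, the interpolant agrees with it everywhere,
# not merely at the four abscissas used to construct it.
check_points = [Fraction(-3), Fraction(1, 2), Fraction(7, 3), Fraction(10)]
reproduced = all(lagrange_eval(xs, ys, x) == f(x) for x in check_points)
print(reproduced)
```

If f(x) were not in the set, the same routine would still return a polynomial agreeing with f(x) at the abscissas, but nothing in the routine itself would indicate how large the discrepancy is elsewhere.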

The final problem, that of devising an appropriate method of estimat- 
ing the error, is a troublesome one and cannot be discussed here. Clearly, 
the precision of the estimate must depend upon the amount of available 
information relative to f(x), and its usefulness will depend upon the form 
in which that information is supplied. In particular, if all available 
information is needed by the selective process, no error estimate is possible. 

It is of some importance to notice that, if $S_n$ is indeed taken as the set
of all polynomials of degree n or less, then the Weierstrass theorem
guarantees only the existence of a member of $S_n$ which affords a satisfactory
approximation to a continuous function f(x), over any finite interval, 
when n is sufficiently large. This does not imply that the particular 
member chosen by a particular selective process will tend to afford such 
an approximation as n increases indefinitely. Only when a dependable 
method of estimating the error is available can this question be resolved 
with certainty. Furthermore, even though it were possible to devise a 
selective process which had this property, it would not necessarily follow 
(for example) that the derivative of the selected polynomial $y_n(x)$ would
tend to approximate the derivative of f(x), even though the latter were 
known to exist and to be continuous. Again, recourse must be had to an 
error estimate. 
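The distinction drawn here, between the existence of a good polynomial approximation and the behavior of the polynomial actually selected, can be seen in a small experiment. The sketch below is not from the text: it assumes the familiar test function 1/(1 + 25x^2) on the interval (−1, 1), equally spaced abscissas, and a maximum error estimated on a fine grid; the names `interpolate`, `runge`, and `max_error` are hypothetical. For this particular function and this particular selective process the estimated maximum error grows as n increases.

```python
def interpolate(xs, ys, x):
    """Lagrange form of the polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, xi in enumerate(xs):
        term = ys[i]
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def runge(x):
    # Continuous on [-1, 1]; an assumed test function, not taken from the text.
    return 1.0 / (1.0 + 25.0 * x * x)

def max_error(n, samples=2001):
    """Largest |f - y_n| found on a uniform grid, using n + 1 equally
    spaced interpolation abscissas on [-1, 1]."""
    xs = [-1.0 + 2.0 * k / n for k in range(n + 1)]
    ys = [runge(x) for x in xs]
    grid = [-1.0 + 2.0 * k / (samples - 1) for k in range(samples)]
    return max(abs(runge(x) - interpolate(xs, ys, x)) for x in grid)

errors = {n: max_error(n) for n in (4, 8, 16)}
for n, err in errors.items():
    print(n, err)
```

The grid maximum is only an estimate of the true maximum error, and a different choice of abscissas would behave differently; the experiment merely illustrates why an error estimate is needed before convergence can be asserted.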

1.3. Errors. Most numerical calculations are inexact, either because 
of inaccuracies in given data, upon which the calculations are based, or 
because of inaccuracies introduced in the subsequent analysis of those 
data. In addition to gross errors, occasioned by unpredictable mistakes 
(human or mechanical) and hypothetically assumed to be absent in the 
remainder of this discussion, it is convenient to define first a round-off
error as the consequence of using a number specified by n correct digits to
approximate a number which requires more than n digits (generally
infinitely many digits) for its exact specification.† Such errors are
frequently present in given data, in which case they may be called inherent
errors, due either to the fact that those data are empirical, and are known
only to n digits, or to the fact that, whereas they are known exactly,
they are “rounded” to n digits according to the dictates of convenience
or of the capacity of a calculating device. They are introduced in
subsequent analysis either because of deliberate rounding or because of the
fact that a calculating device is capable of supplying only a certain
number of digits.

† It is assumed here and elsewhere that the decimal notation is used.

It is then convenient to define a truncation error, by exclusion, as any 
error which is neither a gross error nor a round-off error. Thus, a trunca- 
tion error is one which would be present even in the hypothetical situation 
in which no “mistakes” were made, all given data were exact, and 
infinitely many digits were retained in all calculations. Frequently, a 
truncation error corresponds to the fact that, whereas an exact result 
would be afforded by an infinite sequence of steps, the process is
“truncated” after a certain finite number of steps. However, it is rather
conventional to apply the term in the more general sense defined here.

We define the error associated with an approximate value as the result 
of subtracting that approximation from the true value, 


True value = approximation + error, (1.3.1) 


with a remark that both this definition and that in which the algebraic 
sign of the error is reversed are used elsewhere in the literature. 

The preceding definitions can be illustrated, for example, by calculations
based on the use of power series. Thus, if a function f(x) possesses
n continuous derivatives everywhere in the interval (a,x), it can be
represented by a finite Taylor series of the form

$$f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!}(x-a)^{n-1} + \frac{f^{(n)}(\xi)}{n!}(x-a)^n, \tag{1.3.2}$$

where $\xi$ is some number between a and x. If f(x) satisfies more stringent
conditions, it can be represented by an infinite Taylor series

$$f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \cdots \tag{1.3.3}$$

when |x − a| is sufficiently small.

If f(x) is approximated by the sum of the first n terms of (1.3.3), then
the error committed is represented by the last term (remainder) in (1.3.2).
Thus, for example, if $f(x) = e^{-x}$ and a = 0, we have the relation

$$e^{-x} = 1 - x + \tfrac{1}{2}x^2 - \tfrac{1}{6}x^3 + E_T(x), \tag{1.3.4}$$

where the truncation error is of the form

$$E_T = \tfrac{1}{24}\,e^{-\xi}x^4 \qquad (\xi \text{ between } 0 \text{ and } x). \tag{1.3.5}$$

If x is positive, the same is true of $\xi$, and, by making use of the fact
that $e^{-\xi}$ is then smaller than unity, we may deduce that the
approximation

$$e^{-x} \approx 1 - x + \tfrac{1}{2}x^2 - \tfrac{1}{6}x^3 \tag{1.3.6}$$

is in error by a positive amount smaller than $\tfrac{1}{24}x^4$. In
particular, we have

$$e^{-1/3} \approx 1 - \tfrac{1}{3} + \tfrac{1}{18} - \tfrac{1}{162} = \tfrac{58}{81} \tag{1.3.7}$$

with an error between zero and 1/1944. Since 1/1944 ≐ 0.00051, where the
symbol ≐ is used to signify “rounds to,” the truncation error is smaller
than $5.2 \times 10^{-4}$. If 58/81 is rounded to four places, to give
$e^{-1/3} \approx 0.7160$, the additional error introduced by the round-off
is less than (but here very nearly equal to) five units in the first
neglected place and hence smaller than $0.5 \times 10^{-4}$. It follows
finally that $e^{-1/3} \approx 0.7160$ with an error of magnitude smaller
than $5.7 \times 10^{-4}$. However, whether $e^{-1/3}$ ≐ 0.716 or 0.717 is
not established. If each of the terms in (1.3.7) were rounded to four places
before the terms were combined, a total round-off error as great as
$1.5 \times 10^{-4}$ would be possible. Finally, if the exponent 1/3
represented only an approximation to a value of x, which was not known
exactly but which was known to lie, say, between 0.333 and 0.334, the
approximate maximum error due to this uncertainty could be determined by
noticing that the change $\delta e^{-x}$ corresponding to a small change
$\delta x$ is approximately $(de^{-x}/dx)\,\delta x = -e^{-x}\,\delta x$.
Thus, if the number 1/3 is in error by an amount between
$-3 \times 10^{-4}$ and $+7 \times 10^{-4}$, the magnitude of the maximum
corresponding error in the calculated value is about
$(0.716)(7 \times 10^{-4}) \approx 5 \times 10^{-4}$.
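The three contributions distinguished in this example (truncation, round-off, and inherent error) can be checked numerically. The sketch below is an illustration only: it assumes that the library value `math.exp(-1/3)` may be treated as the true value, and the variable names are hypothetical. The four-term sum is formed exactly with rational arithmetic so that the truncation error is not contaminated by round-off.

```python
from fractions import Fraction
import math

x = Fraction(1, 3)

# Four-term Maclaurin sum 1 - x + x^2/2 - x^3/6, formed exactly.
approx = 1 - x + x**2 / 2 - x**3 / 6
trunc_bound = x**4 / 24                  # bound on the truncation error

true_value = math.exp(-1 / 3)            # assumed here to serve as the true value
trunc_error = true_value - float(approx) # error = true value - approximation

rounded = round(float(approx), 4)        # the four-place value
round_off = abs(float(approx) - rounded)

# Inherent error if x is known only to lie between 0.333 and 0.334:
# |delta(e^-x)| is approximately e^-x |delta x|.
dx_max = 0.334 - 1 / 3
inherent = true_value * dx_max

print(approx, float(trunc_bound))
print(trunc_error, round_off, inherent)
```

The computed truncation error is positive and falls below the bound 1/1944, the round-off error is just under half a unit in the fourth place, and the inherent error is of the same order as the truncation error, in agreement with the discussion above.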

The magnitude of the round-off errors could be reduced arbitrarily by 
retaining additional digits, and that of the truncation error could be 
reduced within any prescribed tolerance by retaining sufficiently many 
terms of the convergent Maclaurin expansion of $e^{-x}$. The inherent error
could be reduced only if the uncertainty of the value of x were decreased. 

It is useful to notice that, since the sign of the truncation error
associated with (1.3.7) is known, the magnitude of the maximum possible
error due to truncation can be halved by replacing the approximation 58/81
by the approximation 58/81 + 1/3888 = 2785/3888 ≐ 0.7163, with a
corresponding truncation error accordingly known to lie between the limits
$\pm 2.6 \times 10^{-4}$.

As an example of a somewhat different nature, we refer to the relation

$$\int_x^\infty \frac{e^{x-t}}{t}\,dt = \frac{1}{x} - \frac{1!}{x^2} + \frac{2!}{x^3} - \frac{3!}{x^4} + \cdots + (-1)^{n-1}\frac{(n-1)!}{x^n} + (-1)^n\,n!\int_x^\infty \frac{e^{x-t}}{t^{n+1}}\,dt \qquad (x > 0), \tag{1.3.8}$$

which is readily established by successive integration by parts. If we
denote the left-hand member by F(x), we can thus write

$$F(x) \approx \frac{1}{x} - \frac{1!}{x^2} + \frac{2!}{x^3} - \frac{3!}{x^4} + \cdots + (-1)^{n-1}\frac{(n-1)!}{x^n} \tag{1.3.9}$$

when x > 0, with a “truncation error”

$$E_T = (-1)^n\,n!\int_x^\infty \frac{e^{x-t}}{t^{n+1}}\,dt. \tag{1.3.10}$$

Since x − t is nonpositive in the range of integration, so that
$e^{x-t} \le 1$, we may deduce that

$$|E_T(x,n)| \le n!\int_x^\infty \frac{dt}{t^{n+1}} \qquad \text{or} \qquad |E_T(x,n)| \le \frac{(n-1)!}{x^n}. \tag{1.3.11}$$

Hence the truncation error is smaller in absolute value than the last
term retained in the approximation and also is evidently of opposite sign.
Further, since $1/t^{n+1} \le 1/x^{n+1}$ in the integration range, we see
that

$$|E_T(x,n)| \le \frac{n!}{x^{n+1}}\int_x^\infty e^{x-t}\,dt = \frac{n!}{x^{n+1}}, \tag{1.3.12}$$

so that the truncation error here is also smaller in absolute value than
the first term neglected, and is clearly of the same sign.

For a fixed number (n) of terms, the truncation error clearly is small
when x is large and can be made arbitrarily small by taking x sufficiently
large. However, for a given x, the error cannot be made arbitrarily small
by retaining sufficiently many terms. In fact, we may notice that if
the right-hand member of (1.3.9) were considered as the result of retaining
the first n terms of an infinite series, then the ratio of the (n + 1)th
term of that series to the nth would be −n/x. Hence, the successive
terms decrease steadily in magnitude as long as n < x, but then increase
unboundedly in magnitude as n increases beyond x. Thus the series does
not converge for any value of x.

Nevertheless, it is useful for computation when x is fairly large. Thus,
if x = 10, the smallest term occurs when n = x = 10 and is given by
$-9!/10^{10} \approx -3.6 \times 10^{-5}$. Thus, the approximation afforded
by retention of 10 terms would be in error by a positive quantity smaller
than $4 \times 10^{-5}$. This would be the best possible approximation
obtainable from (1.3.9), when x = 10, since retention of additional terms
would increase the possible magnitude of the error.
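The behavior described for x = 10 can be observed directly. The following sketch is an illustration under stated assumptions: the reference value of F(10) is obtained by Simpson's rule applied to the equivalent integral of e^(−u)/(x + u) over u ≥ 0 (the substitution t = x + u), cut off at an assumed upper limit where the integrand is negligible; the names `partial_sum` and `F_numeric` and the step counts are hypothetical choices. The partial sums themselves are formed exactly.

```python
from fractions import Fraction
from math import exp, factorial

def partial_sum(x, n):
    """Sum of the first n terms 1/x - 1!/x^2 + 2!/x^3 - ... of the expansion."""
    return sum(Fraction((-1) ** (k - 1) * factorial(k - 1), x ** k)
               for k in range(1, n + 1))

def F_numeric(x, upper=60.0, steps=60000):
    """Simpson's rule for the integral of e^(-u)/(x + u) from 0 to `upper`,
    which equals F(x) up to a negligible tail when `upper` is large."""
    h = upper / steps
    g = lambda u: exp(-u) / (x + u)
    s = g(0.0) + g(upper)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * g(i * h)
    return s * h / 3

x = 10
reference = F_numeric(x)

# error = reference value - partial sum, for several numbers of terms
errors = {n: reference - float(partial_sum(x, n)) for n in (5, 10, 15, 20)}
smallest_term = min(Fraction(factorial(n - 1), x ** n) for n in range(1, 21))

for n, err in errors.items():
    print(n, err)
print(float(smallest_term))
```

With ten terms the error is positive and smaller than the smallest term, whereas with five or with twenty terms it is larger in magnitude, so that retaining more terms beyond n = x does not help.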

A divergent series of the type considered, for which the magnitude of
the error associated with retention of only n terms can be made arbitrarily
small by taking a parameter x sufficiently near a certain fixed value $x_0$
(or sufficiently large in magnitude), and for which the error first decreases
as n increases but eventually increases unboundedly in magnitude with
increasing n, when x is given a fixed value other than $x_0$, is often called
an asymptotic series. An example of the former type, with $x_0 = 0$, is
afforded by the relation

$$\int_0^\infty \frac{e^{-u}}{1+xu}\,du = 1 - 1!\,x + 2!\,x^2 - 3!\,x^3 + \cdots + (-1)^n\,n!\,x^n + E(x,n)$$

when x > 0, which can be obtained from (1.3.8) by replacing x by 1/x
and making the change of variables t = (1 + xu)/x in the integral, and
for which it is true that $x^{-n}E(x,n) \to 0$ as x tends to zero from the
positive direction, but $|E(x,n)| \to \infty$ as $n \to \infty$ for any
fixed x > 0.

For a representation of the form

$$f(x) = a_0 + \frac{a_1}{x} + \frac{a_2}{x^2} + \cdots + \frac{a_n}{x^n} + E(x,n),$$

it is usually stipulated also that $x^n E(x,n)$ is to tend to zero as
$|x| \to \infty$; for an expansion of the form

$$f(x) = a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + \cdots + a_n(x - x_0)^n + E(x,n),$$

the additional requirement that $E(x,n)/(x - x_0)^n$ is to tend to zero as
$x \to x_0$ is usually imposed. Equation (1.3.12) shows that (1.3.9) is thus
asymptotic in the strict sense. However, the term is often applied somewhat
more loosely to expansions of more general type, which are not
necessarily power series.

When x is fixed, the error frequently decreases rapidly, as additional
terms are taken into account, until a point of diminishing return is
reached, after which the error begins to increase in magnitude. In such
cases, if the error is reduced within the prescribed tolerance before that
point is attained, then the approximate calculation can be successfully
effected.

A great many of the expansions which are of frequent use in numerical
analysis are essentially of this type. For them, the term “truncation
error” generally applies only in the general sense of the definition given
earlier in this section, and generally does not correspond to the result of
truncating a convergent infinite process after a finite number of steps, but
to the result of truncating a process which first tends to converge, but
would ultimately diverge, at a stage before the tendency to diverge
manifests itself.

1.4. Significant Figures. The conventional process of rounding or
“forcing” a number to n digits consists in replacing that number by an
n-digit approximation with minimum error.† When this requirement
leads to two admissible roundings, that one for which the nth digit of the
rounded number is even is generally selected. Thus 4.05149 ≐ 4.0515,
4.051, 4.05, 4.1, and 4; but 4.0515 ≐ 4.052 and 4.05 ≐ 4.0. With this
rule, the error is never larger in magnitude than one-half unit of the place
of the nth digit in the rounded number.‡ The errors introduced in the
rounding of a large set of numbers, which are to be combined in a certain
way, also tend to be equally often positive and negative, so that their
effects often tend to cancel. The slight favoring of even numbers is
prompted by the fact that any subsequent operations on the rounded
numbers are then somewhat less likely to necessitate additional round-offs.

† This process is to be distinguished from the process of chopping,
which consists of discarding all digits following the nth digit without
modifying the nth digit and which must be used when capacity limitations of
a calculating device do not permit the determination of more than n digits.
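The rounding convention described above, with ties resolved in favor of an even final digit, corresponds to the `ROUND_HALF_EVEN` mode of Python's standard `decimal` module, and chopping corresponds to `ROUND_DOWN`. The sketch below reproduces the examples of the text; the helper name `round_places` is a hypothetical one, and decimal strings are used as input so that no binary conversion error intervenes.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_DOWN

def round_places(value, places, mode=ROUND_HALF_EVEN):
    """Round a decimal string to `places` decimal places using `mode`."""
    quantum = Decimal(1).scaleb(-places)      # 10**(-places)
    return Decimal(value).quantize(quantum, rounding=mode)

# 4.05149 rounded successively to 4, 3, 2, 1, and 0 decimal places
print([str(round_places("4.05149", p)) for p in (4, 3, 2, 1, 0)])

# Ties: the even final digit is selected
print(round_places("4.0515", 3), round_places("4.05", 1))

# Chopping merely discards the digits beyond the one retained
print(round_places("4.0519", 2, ROUND_DOWN))
```

Each rounding here is applied to the original number; rounding an already rounded value a second time can give a different result, which is why 4.05149 yields 4.1 to one place while 4.05 yields 4.0.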

Each correct digit of this approximation, except a zero which serves 
only to fix the position of the decimal point, is called a significant digit or 
figure. Thus, the numbers 2.159, 0.04072, and 10.00 each contain four 
significant figures. Whether or not the last digit of 14620 is significant 
depends upon the context. If “a number known to be between 14615
and 14625” is intended, then that zero is not significant and the number
would preferably be written as $1.462 \times 10^4$. Otherwise the form
$1.4620 \times 10^4$ would be appropriate.

More generally, if any approximation $\bar{N}$ to a number N has the
property that both N and $\bar{N}$ round to the same set of n significant
figures, then $\bar{N}$ may be said to approximate N to n significant
figures. Thus, if N = 34.655000 · · · and $\bar{N}$ = 34.665000 · · · , then
n = 4. Clearly, the error N − $\bar{N}$ cannot exceed one unit of the place
of the nth digit in magnitude, but, as this example illustrates, the error
may take on that maximum value. On the other hand, if N = 38.501 and
$\bar{N}$ = 38.499, then n = 4 but n ≠ 2 (the two numbers round alike to
38.50, but to 39 and 38, respectively), in spite of the fact that the error
is less than three units in the place of the fifth digit. This point is of
practical importance only in that it illustrates the fact that, no matter
how accurately a calculation is to be effected, the result of rounding the
calculated value to n digits cannot be guaranteed in advance to possess n
correct digits, but may differ from the rounded true value by one unit in
the last digit.

‡ It should be noticed that, when 9.95 is rounded to 10.0, the result still
contains three correct digits; the error amounts to one-half unit in the
third figure of the rounded number, but to five units in the third figure of
the original number.

It may be seen that the concept of significant figures is related more
intimately to the relative error,

$$\text{Relative error} = \frac{\text{true value} - \text{approximation}}{\text{true value}}, \tag{1.4.1}$$

than to the error itself. In order to exhibit the relationship more
specifically, it is useful to define N* and r such that

$$N = N^* \times 10^r \qquad \text{where } 1 \le N^* < 10, \tag{1.4.2}$$

where r is an integer and hence is that integer for which
$10^r < N < 10^{r+1}$, when N is positive and not an integral power of 10,
as will be assumed here. Thus N* = N when 1 < N < 10, N* = N/10 when
10 < N < 100, N* = 10N when 0.1 < N < 1, and so forth. If we write
$E = N - \bar{N}$ and R = E/N, for the error and relative error,
respectively, and suppose that $\bar{N}$ approximates N to n significant
figures, so that

$$|E| \le 10^{r-n+1}, \tag{1.4.3}$$

it then follows that

$$|R| = \frac{|E|}{N} \le \frac{10^{r-n+1}}{N^* \times 10^r} = \frac{10^{-n+1}}{N^*}. \tag{1.4.4}$$

In particular, we have $|R| < 10^{-n+1}$.

Further, if $\bar{N}$ is the result of rounding N to n significant figures,
(1.4.3) is then replaced by the stronger estimate

$$|E| \le 5 \times 10^{r-n}, \tag{1.4.5}$$

and there then follows

$$|R| \le \frac{5}{N^*} \times 10^{-n}. \tag{1.4.6}$$

In particular, $|R| < 5 \times 10^{-n}$.

If we also write

$$E = \omega \times 10^{r-n+1}, \tag{1.4.7}$$

it follows that $\omega$ is the error expressed in units of the place of the
nth digit of N, and we have also

$$\omega = \frac{E}{10^{r-n+1}} = \frac{NR}{10^{r-n+1}} = N^* R \times 10^{n-1}. \tag{1.4.8}$$
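The quantities N*, r, E, R, and ω introduced above can be computed for a specific number as a check on the relations between them. The sketch below is illustrative: the function name `error_measures` and the sample value 0.040728 are hypothetical, rounding to even is assumed, and ω is taken as the error divided by one unit of the place of the nth digit, that is, by 10^(r−n+1).

```python
from decimal import Decimal, ROUND_HALF_EVEN

def error_measures(value, n):
    """Return (N*, r, E, R, omega) when a positive `value` is rounded
    to n significant figures."""
    N = Decimal(value)
    r = N.adjusted()                         # N = N* x 10^r with 1 <= N* < 10
    N_star = N.scaleb(-r)
    unit = Decimal(1).scaleb(r - n + 1)      # one unit of the place of the nth digit
    N_bar = N.quantize(unit, rounding=ROUND_HALF_EVEN)
    E = N - N_bar                            # error = true value - approximation
    R = E / N                                # relative error
    omega = E / unit                         # error in units of the nth place
    return N_star, r, E, R, omega

n = 3
N_star, r, E, R, omega = error_measures("0.040728", n)
print(N_star, r, E, R, omega)
```

For this sample the rounded value is 0.0407, so that the error is 0.28 unit of the third significant place, and the relative error respects the bound obtained by dividing 5 × 10^(−n) by N*.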


Suppose next that two numbers N, and Nz are each rounded to n 
significant figures, and that the corresponding maximum error in the 
product P = N,N2 is required. We notice first that, if R(P) refers to P 
and Ri, Rz to Ni, No, there follows 

R(P) _ NiN2 — NiN2 =1-(1-R)(—-—8®) =Rit R, — RiRo. 
NwNe 
Thus we see that |R(P)| is largest when Ri and R» are negative, and, from 
(1.4.6), there follows 
1 1 25 
< —— pane n —-2n 
|R(P)| Ss 5 (We + vi) x 10 + yeyg X 107", 


and hence, by using (1.4.8), we obtain 





* 1 N * 
oP) 3 CON" + wa) + 3 oNeNy X10 49) 
Since $(N_1 N_2)^* = N_1^* N_2^* \times 10^{-p}$, where $p$ is either 0 or 1, the right-hand
member of (1.4.9) is of the form

$$\frac{10^{-p}}{2}\,(N_1^* + N_2^* + 5 \times 10^{-n}),$$

and the most unfavorable cases are those for which $p = 0$. With this
constraint, the function $\phi(N_1^*, N_2^*) = \tfrac{1}{2}(N_1^* + N_2^*)$ is to be
considered only for $1 \le N_1^* < 10$, $1 \le N_2^* < 10$, and $1 \le N_1^* N_2^* < 10$,
and clearly cannot take on a maximum value in this open region. The
maximum value of $\phi$ on the boundary of the region is easily seen to occur
when either $N_1^* = 1$ and $N_2^* = 10$ or $N_1^* = 10$ and $N_2^* = 1$. Thus the
right-hand member of (1.4.9) cannot exceed the limiting value corresponding
to $(N_1 N_2)^* = 10-$ and either $N_1^* = 1+$, $N_2^* = 10-$ or $N_1^* = 10-$,
$N_2^* = 1+$, and there follows

$$|\omega_n(N_1 N_2)| < \frac{11}{2} + \frac{5}{2} \times 10^{-n} < 6, \tag{1.4.10}$$


where $\omega_n$ is the error expressed in units of the place of the nth digit of the
product.
This means that if two numbers are rounded to n significant figures,
the product of the rounded numbers differs from the true product by less than
six units in the place of its nth significant digit. In illustration, when
$N_1 = 1.05+$ and $N_2 = 9.45+$ there follows $N_1 N_2 = 9.9225+$, and, if
$N_1$ and $N_2$ are rounded to two significant figures to give $\bar{N}_1 = 1.1$ and
$\bar{N}_2 = 9.5$, there follows $\bar{N}_1 \bar{N}_2 = 10.45$. Thus, in this extreme case,
the error amounts to about 5.3 units in the place of the second significant
digit of the product.
When $N_1 = N_2 = N$, the worst (limiting) situation is that in which
$(N^*)^2 = (N^2)^* = 10-$. Thus there follows

$$|\omega_n(N^2)| < 10^{1/2} + \frac{5}{2} \times 10^{-n} < 4, \tag{1.4.11}$$

so that the square of a number rounded to n digits differs from the square
of the unrounded number by less than four units in the place of its nth digit.
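The extreme two-figure illustration can be checked directly. A minimal sketch, assuming the standard `decimal` module and a hypothetical helper `omega_units` that expresses an error in units of the nth significant digit of the true value; the rounded factors 1.1 and 9.5 are supplied by hand because the text's factors lie just above 1.05 and 9.45.

```python
from decimal import Decimal

def omega_units(true_value: Decimal, approx: Decimal, n: int) -> Decimal:
    """Error of approx in units of the place of the nth significant digit of true_value."""
    r = true_value.adjusted()
    return (true_value - approx) / Decimal(1).scaleb(r - n + 1)

true_product = Decimal("1.05") * Decimal("9.45")      # limiting true product
rounded_product = Decimal("1.1") * Decimal("9.5")     # product of the rounded factors
w = omega_units(true_product, rounded_product, 2)
```

The magnitude of `w` stays below the limit of six units stated for products of two rounded factors.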
More generally, if we consider $P = N_1 N_2 \cdots N_m$, we find that

$$R(P) = 1 - [(1 - R_1)(1 - R_2) \cdots (1 - R_m)]$$

and hence

$$|\omega_n(P)| \le (N_1 \cdots N_m)^* \times 10^{n-1}\,[(1 + |R_1|)(1 + |R_2|) \cdots (1 + |R_m|) - 1]
\le \frac{(N_1 \cdots N_m)^*}{2a_n}\left[\left(1 + \frac{a_n}{N_1^*}\right) \cdots \left(1 + \frac{a_n}{N_m^*}\right) - 1\right], \tag{1.4.12}$$

where

$$a_n = 5 \times 10^{-n}. \tag{1.4.13}$$

Here the worst situation is that in which $m - 1$ of the $m$ numbers $N_k^*$
are $1+$ and the remaining one is $10-$, such that also

$$N_1^* \cdots N_m^* = (N_1 \cdots N_m)^* = 10-.$$

Thus there follows, from (1.4.12),

$$|\omega_n(N_1 \cdots N_m)| < \frac{5}{a_n}\left[(1 + a_n)^{m-1}\left(1 + \frac{a_n}{10}\right) - 1\right]. \tag{1.4.14}$$

Corresponding numerical bounds on the quantity $|\omega_n(N_1 \cdots N_m)|$ are
given in Table 1.1.


TABLE 1.1

  n \ m |   2    3    4    6    8   10
 -------+------------------------------
        |   6   11   17   29   42   56
        |   6   11   16   26   36   46
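The entries of Table 1.1 can be recomputed from the worst-case bound on a product of $m$ rounded factors. The sketch below assumes that bound in the form $(5/a_n)[(1 + a_n)^{m-1}(1 + a_n/10) - 1]$ with $a_n = 5 \times 10^{-n}$, and assumes the column headings $m = 2, 3, 4, 6, 8, 10$; the function name `product_bound` is hypothetical. Under these assumptions, $n = 2$ reproduces the first row of the table, and the limiting value $5(m - 0.9)$ for large $n$, rounded up, reproduces the second.

```python
import math

def product_bound(m: int, n: int) -> float:
    """Worst-case bound on |omega_n| for a product of m factors,
    each rounded to n significant figures."""
    a = 5 * 10.0 ** (-n)
    return (5 / a) * ((1 + a) ** (m - 1) * (1 + a / 10) - 1)

ms = [2, 3, 4, 6, 8, 10]
row_n2 = [math.ceil(product_bound(m, 2)) for m in ms]
row_limit = [math.ceil(5 * (m - 0.9)) for m in ms]    # limit of the bound as n grows
```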





In the special case when $N_1 = N_2 = \cdots = N_m = N$, there follows

$$|\omega_n(N^m)| \le \frac{(N^m)^*}{2a_n}\left[\left(1 + \frac{a_n}{N^*}\right)^m - 1\right], \tag{1.4.15}$$

and the worst situation is that in which $(N^*)^m = (N^m)^* = 10-$, when
$m$ is a positive integer, so that

$$|\omega_n(N^m)| < \frac{5}{a_n}\,[(1 + 10^{-1/m} a_n)^m - 1] \qquad (m = 1, 2, \ldots). \tag{1.4.16}$$

Numerical bounds on $|\omega_n(N^m)|$ are given in Table 1.2.


TABLE 1.2

  n \ m |   1    2    3    4    6    8   10
 -------+-----------------------------------
    2   | 0.5    4    8   12   23   35   48
   ≥3   | 0.5    4    7   12   21   31   41

When $P = N^m$, with $m = 1/p$ the reciprocal of a positive integer, so
that the operation involved is that of root extraction, the relation (1.4.15)
again holds, but here the worst case is that in which $N = 10^{kp-1}+$,
where $k$ is any integer, so that $(N^m)^* = 10^{1-m}+$ and $N^* = 1+$, in
accordance with which there follows

$$|\omega_n(N^{1/p})| < \frac{5}{a_n} \times 10^{-m}\,[(1 + a_n)^m - 1] \qquad (m = 1/p). \tag{1.4.17}$$

Numerical bounds on $|\omega_n(N^{1/p})|$, where $p$ is a positive integer, are given
in Table 1.3.
TABLE 1.3

  n \ p |    1     2     3     4     8    16    32
 -------+------------------------------------------
   ≥2   | 0.50  0.79  0.78  0.71  0.47  0.28  0.15





The given bounds apply for all N, but may be quite conservative in
any specific case. Thus, if it is known only that $|N - 9.61| \le 0.005$,
then it can be verified that $|\sqrt{N} - 3.100| \le 0.0009$, whereas Table 1.3
gives a bound of 0.0078. (Here the guaranteed accuracy of the calculated
result is greater than that of the basic data.) Still, none of the
bounds can be appreciably lowered, since each is nearly attained in some
case.

In illustration, we may note that, if $N = 1.445$ and if $N^6$ is approximated
by $(1.44)^6 = 8.916 \approx 8.92$, the result differs from the true value
$N^6 = 9.103 \approx 9.10$ by about 19 in units of the third digit. Table 1.2
gives an upper limit of 21. The number $(106.4)^{1/3}$ should be reliable to
three significant figures, according to Table 1.3, with the fourth digit in
error by no more than 1. The calculated value rounds to 4.7385, whereas
$(106.35)^{1/3} = 4.7378$ and $(106.45)^{1/3} = 4.7393$. The maximum error is thus
about 0.8 units of the place of the fourth digit, as is just admitted by
Table 1.3, and the last digit of the rounded four-place value, 4.738, is in
error by not more than 1. However, whereas the value actually calculated
is in error by an amount not exceeding $0.8 \times 10^{-3}$, as predicted,
the rounded value may be in error by $1.3 \times 10^{-3}$.
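The two illustrations can be reproduced in a few lines. This is a rough sketch in ordinary floating-point arithmetic; the variable names are illustrative only, and the cube root assumes that 106.4 is uncertain by half a unit in its last place.

```python
# N = 1.445 approximated by 1.44, both raised to the sixth power.
power_error_units = (1.445 ** 6 - 1.44 ** 6) / 0.01      # units of the third digit

# Cube root of 106.4 with the argument anywhere between 106.35 and 106.45.
root = 106.4 ** (1 / 3)
root_error_units = max(106.45 ** (1 / 3) - root,
                       root - 106.35 ** (1 / 3)) / 0.001  # units of the fourth digit
```

Both values can be compared with the limits 21 and 0.78 that the text quotes from Tables 1.2 and 1.3.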

The calculated value of the product

(3.658)(24.765)(1.4345)(72.43)

certainly will be in error by less than 16 units of the place of its fourth
significant digit, in virtue of Table 1.1, under the assumptions that each
factor is correctly rounded to the digits written and that sufficiently many
digits are retained in the calculation itself. However, since the second
and third factors each involve five significant figures, their product alone
will be correct within 6 units of its fifth digit, so that actually the maximum
error is the same as that associated with the product of three four-digit
numbers and hence will be less than 11 units in the place of the fourth
digit. Clearly (contrary to advice sometimes given) the procedure of
deliberately rounding each of the factors to four digits before the
multiplication would be a wasteful one, since it thus would increase the
maximum possible error. Multiplication actually yields the calculated value
$9.412 \times 10^3$ (to four digits), while the largest and smallest possible values
of the true product are found to round to $9.415 \times 10^3$ and $9.410 \times 10^3$.
Thus the maximum error here is only 3 in the fourth digit. The result
of rounding each factor to four digits before multiplying rounds to
$9.407 \times 10^3$, which hence certainly is in error by at least 3 in the fourth
digit, and which may be in error by as much as 8.
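The extreme values quoted for this product can be recomputed exactly. A minimal sketch, assuming the standard `decimal` module and assuming each factor may be off by half a unit in its last written place; `prod` and `half_unit` are hypothetical helpers.

```python
from decimal import Decimal as D

factors = [D("3.658"), D("24.765"), D("1.4345"), D("72.43")]

def prod(xs):
    """Exact product of a list of Decimals."""
    p = D(1)
    for x in xs:
        p *= x
    return p

def half_unit(x):
    """Half a unit in the last written decimal place of x."""
    return D(5).scaleb(x.as_tuple().exponent - 1)

calculated = prod(factors)
largest = prod([x + half_unit(x) for x in factors])
smallest = prod([x - half_unit(x) for x in factors])
```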
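1.5. Error Bounds. When any function $f(N)$, with a continuous derivative,
is evaluated with $N$ replaced by an approximation $\bar{N}$, the relation

$$f(N) - f(\bar{N}) = (N - \bar{N})\,f'(\eta) \qquad (\eta \text{ between } N \text{ and } \bar{N}) \tag{1.5.1}$$

permits us to deduce that

$$|E(f(N))| \le |f'(\eta)|_{\max}\,|E(N)| \tag{1.5.2}$$

and

$$|R(f(N))| = \frac{|E(f(N))|}{|f(N)|} \le \frac{|f'(\eta)|_{\max}}{|f(N)|}\,|E(N)|. \tag{1.5.3}$$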


Analogous results are readily obtained in cases when several independent
variables are involved.
In illustration, if $f(N) = \log_{10} N$, there follows

$$|E(\log_{10} N)| \le \frac{\log_{10} e}{\eta}\,|E(N)| \qquad [\eta \text{ between } N - E(N) \text{ and } N + E(N)]$$

and hence, if $N \ge 1$ and $|E(N)| < \tfrac{1}{2}$,

$$|E(\log_{10} N)| \le \frac{0.44}{N - |E(N)|}\,|E(N)| < |E(N)|, \tag{1.5.4}$$

so that the error in the common logarithm is smaller than the error in its
argument, when that argument exceeds unity. On the other hand,

$$|R(10^N)| = (\log_e 10)\,10^{\eta - N}\,|E(N)|,$$

and hence, if $|E(N)| < \tfrac{1}{2}$,

$$|R(10^N)| \le 2.31 \times 10^{|E(N)|}\,|E(N)| < 8|E(N)|. \tag{1.5.5}$$

Thus, the error in $10^N$, expressed in units of the place of its nth significant
figure, is less than $8|E(N)| \times 10^{n}$. Hence, if the error in the common
logarithm is smaller in magnitude than 1 in units of the nth decimal
place, then the antilogarithm is in error by less than 8 in units of its nth
significant figure and hence is correct to at least $n - 1$ significant figures.

As a further illustration of the use of (1.5.1), we next investigate the
degree of indeterminacy of the quantity†

$$\log \sin 1.412762,$$

† The notation log u, with no base specified, is to be used consistently to denote
$\log_e u$; the arguments of trigonometric functions are always to be expressed in radians
unless degrees are explicitly specified.

under the assumption that the argument is a rounded number. The
use of (1.5.1) gives the bound

$$|E| \le (5 \times 10^{-7})\,|\cot \eta|_{\max} \qquad (1.4127615 < \eta < 1.4127625)$$

on the inherent error $E$, and, since $0.17 > \cot x > 0.15$ for $1.41 < x < 1.42$,
there follows $|E| < 0.85 \times 10^{-7}$, so that the desired quantity is
determinate to within less than one unit in the seventh decimal place.
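The bound on the inherent error can be compared with the actual spread of log sin over the rounding interval. A small sketch using only the standard `math` module; the variable names are illustrative, and the argument is assumed uncertain by half a unit in its sixth decimal place.

```python
import math

x, half = 1.412762, 5e-7                 # rounded argument and its maximum round-off
# cot is positive and decreasing on this interval, so its maximum is at the left end.
bound = half * (1 / math.tan(x - half))
spread = math.log(math.sin(x + half)) - math.log(math.sin(x - half))
half_spread = spread / 2                 # largest deviation from the central value, nearly
```

The computed `bound` is slightly sharper than the text's figure, which uses the cruder estimate cot x < 0.17.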

In the linear processes of addition and subtraction, the error in the result 
is merely the algebraic sum of the errors in the separate terms, and the 
magnitude of the maximum error is the sum of the magnitudes of the 
component errors. Thus, whereas in multiplication and division we are 
concerned principally with ratios of errors to true quantities, and with 
the number of significant figures, and the absolute position of the decimal 
point is of importance only in fixing the magnitude of the end result, in 
addition and subtraction the errors themselves are the important quan- 
tities, significant figures are involved only incidentally, and the orienta- 
tion of a digit sequence relative to the decimal point is of importance 
throughout the calculation. 

Thus, if k numbers (positive or negative) are each rounded to n decimal
places, so that each is in error by an amount less than $5 \times 10^{-n-1}$ in
magnitude, the magnitude of the maximum error of the sum is clearly
$5k \times 10^{-n-1}$, corresponding to the situation in which the signs of the errors
are such that they combine without cancellation. Accordingly, the
result can be in error by as much as k/2 units in the nth decimal place.

Formal addition assigns to the sum

56.434 + 251.37 − 2.6056 + 84.674 − 396.06 + 7.0228

the value 0.8352. However, if each number is correct only to the five
significant figures given, the error in the result can have any value
between the limits ±0.0111, so that the result would be recorded as
0.84, with the last digit in doubt by two units,† and only one truly
significant digit remains. Rounding all of the numbers to the two decimal
places which are in common, before addition, would lead to the result
0.82, but would increase the error limits to ±0.03. Rounding each of
the more accurate numbers to one place beyond the last place of the least
accurate one gives 0.835 with error limits ±0.012, so that the recorded
entry is again 0.84 (or $0.8_4$), with the last digit in doubt by two units,
and is a procedure which is generally to be recommended in such cases.
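The bookkeeping in this example is easy to mechanize. The sketch below assumes the standard `decimal` module and takes the second term as 251.37, the five-figure value consistent with the stated sum 0.8352 and error limits ±0.0111; each term is assumed to be in error by at most half a unit in its last written place.

```python
from decimal import Decimal as D

terms = ["56.434", "251.37", "-2.6056", "84.674", "-396.06", "7.0228"]
values = [D(t) for t in terms]

formal_sum = sum(values)
# Maximum error: the half-units of the individual terms add without cancellation.
max_error = sum(D(5).scaleb(v.as_tuple().exponent - 1) for v in values)
# Rounding every term to the two decimal places held in common, before adding.
two_place_sum = sum(v.quantize(D("0.01")) for v in values)
```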

† The notation $0.8_4$ is often used to indicate that, whereas the 4 is not significant,
the maximum error associated with 0.84 is less than (or probably less than) that
associated with 0.8.

A somewhat similar situation, in which the outcome is, however,
reversed, is of some importance. Tables of functions often provide a
column of differences between successive entries, to facilitate linear
interpolation, according to the formula

$$f(x_0 + \theta h) \approx f(x_0) + \theta\,[f(x_1) - f(x_0)] \qquad (0 < \theta < 1), \tag{1.5.6}$$

where $x_0$ and $x_1$ are successive tabular arguments and $h = x_1 - x_0$. In
constructing such a table, in which (say) all entries are to be rounded to
a certain number of decimal places, the question arises as to whether the
number tabulated for the difference $f(x_1) - f(x_0)$ should represent the
rounded value of the difference or the difference between the rounded
values, in those cases where these values differ. Intuition perhaps would
recommend the former procedure, since it appears to make use of additional
information. However, if $\epsilon$ represents the maximum round-off,
the maximum error in that case is clearly $(1 + \theta)\epsilon$, whereas, since in the
second case the right-hand member is properly to be considered in the
form $(1 - \theta) f(x_0) + \theta f(x_1)$, and since $0 < \theta < 1$, the maximum error in
that case is seen to be $(1 - \theta)\epsilon + \theta\epsilon = \epsilon$. Thus the maximum error is
less if the difference of the rounded values is used (for a more detailed
discussion of this question, see Ostrowski [175]†). In particular, the
user of tables which do not explicitly list differences need not regret the
fact that he is forced to employ that procedure when using (1.5.6).
The truncation error associated with that formula is considered in the
following chapter.

The loss of significant figures in subtraction is one of the principal 
sources of error in numerical analysis, and it is highly desirable to arrange 
the sequence of calculations so that such subtractions are avoided, if 
possible, or so that their effects are brought into specific evidence. As 
a simple example, in calculating $ab - ac = a(b - c)$, where $b$ and $c$ are
very nearly equal, the products ab and ac may have many of their leading 
digits in common, and the number of significant figures which must be 
retained in each product, in order that sufficiently many significant 
figures will remain in the difference, can be determined only after both 
products have been evaluated. This dilemma is avoided if b — c is cal- 
culated first. Naturally, if a, b, and c are specified only to a certain num- 
ber of significant figures, and if no round-offs are introduced, the order of 
the calculations is irrelevant from this point of view, and the correspond- 
ing degree of uncertainty in the result merely must be accepted. 

Frequently it is possible to exploit special properties of functions
involved in the analysis. Thus, if $a$ and $b$ are nearly equal, it is convenient
to replace $\log b - \log a$ by $\log (b/a)$, $\sin b - \sin a$ by the product
$2 \sin \tfrac{1}{2}(b - a) \cos \tfrac{1}{2}(a + b)$, and $\sqrt{b} - \sqrt{a}$ by $(b - a)/(\sqrt{a} + \sqrt{b})$.

† Numbers in brackets refer to items of the bibliography, Appendix B.

For example, if $4AC \ll B^2$, the quadratic formula is inconvenient for
the determination of the smaller root of the equation $Ax^2 + Bx + C = 0$.
In this case, when $B > 0$, it is desirable to replace the usual formula
$x_1 = (-B + \sqrt{B^2 - 4AC})/2A$ by the equivalent form

$$x_1 = \frac{-2C}{B + \sqrt{B^2 - 4AC}}$$

for a specific calculation, or to write

$$\sqrt{B^2 - 4AC} = B\left(1 - \frac{4AC}{B^2}\right)^{1/2}$$

in the original form and to expand the result by the binomial theorem to
give

$$x_1 = -\frac{C}{B}\left(1 + \frac{AC}{B^2} + \cdots\right),$$

when the dependence of $x_1$ on the literal parameters is to be studied.
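The loss of figures in the usual quadratic formula, and its avoidance by the rationalized form, can be seen in double-precision arithmetic. A minimal sketch, assuming $B > 0$ and $B^2 \ge 4AC$; the function names and the test coefficients are illustrative choices.

```python
import math

def smaller_root(A: float, B: float, C: float) -> float:
    """Root of smaller magnitude of A x^2 + B x + C = 0 (B > 0), from the
    rationalized form -2C / (B + sqrt(B^2 - 4AC))."""
    return -2 * C / (B + math.sqrt(B * B - 4 * A * C))

def smaller_root_naive(A: float, B: float, C: float) -> float:
    """The usual formula, which subtracts two nearly equal numbers."""
    return (-B + math.sqrt(B * B - 4 * A * C)) / (2 * A)

A, B, C = 1.0, 1.0e8, 1.0
stable = smaller_root(A, B, C)
naive = smaller_root_naive(A, B, C)
series = -(C / B) * (1 + A * C / B ** 2)    # first two terms of the binomial expansion
```

With these coefficients the rationalized form agrees with the series to full working precision, while the usual formula loses most of its significant figures to cancellation.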
1.6. Random Errors. If 1000 positive numbers, each rounded to n
decimal places, were added, the total error due to round-off could amount
to 500 units in the last place of the sum. Whereas this maximum error
would be attained only in the case when all numbers were rounded in the
same direction, by exactly one-half unit, the possibility of its occurrence
forces us to accept its value as the least upper bound on the possible error.
However, the price of certainty in such a case is a high one, and in many
situations it cannot be tolerated. Furthermore, in a great number of
practical cases certainty cannot be attained. Thus each member of a
set of 1000 numbers, to be added, may itself represent the mean of a set
of empirical values of a physical quantity, in which case one generally
cannot guarantee that the error associated with it is less than a certain
amount, but can only estimate the probability that this is the situation.
Frequently, in such cases it is assumed that the errors are symmetrically
distributed about a zero mean and that, in a sufficiently large set of
measurements, the probability of the occurrence of an error between $x$ and
$x + dx$ is, to a first approximation, of the form

$$\phi(x)\,dx = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/2\sigma^2}\,dx, \tag{1.6.1}$$
where $\sigma$ is a constant parameter, to be adjusted to the observations. The
function $\phi$ is called the frequency function of the distribution. The
probability that an error not exceed $x$ algebraically is given by the
normal distribution function

$$\Phi(x) = \int_{-\infty}^{x} \phi(t)\,dt = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{x} e^{-t^2/2\sigma^2}\,dt, \tag{1.6.2}$$

the numerical coefficient in (1.6.1) having been determined so that
$\Phi(\infty) = 1$,

$$\int_{-\infty}^{\infty} \phi(t)\,dt = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-t^2/2\sigma^2}\,dt = 1, \tag{1.6.3}$$

in accordance with the requirement of unit probability that any error
lie somewhere in $(-\infty, \infty)$.
Further, the probability $P(x)$ that an error chosen at random lie
between $-|x|$ and $+|x|$, that is, that its magnitude not exceed $|x|$, is clearly
given by

$$P(x) = \Phi(|x|) - \Phi(-|x|) = \int_{-|x|}^{|x|} \phi(t)\,dt = 2\int_{0}^{|x|} \phi(t)\,dt$$

or

$$P(x) = \frac{1}{\sigma}\sqrt{\frac{2}{\pi}}\int_{0}^{|x|} e^{-t^2/2\sigma^2}\,dt, \tag{1.6.4}$$

whereas the probability that it exceed $|x|$ in magnitude is $Q(x) = 1 - P(x)$.
Equation (1.6.4) can also be written in the form

$$P(x) = \frac{2}{\sqrt{\pi}}\int_{0}^{|x|/\sqrt{2}\,\sigma} e^{-s^2}\,ds = \operatorname{erf}\left(\frac{|x|}{\sqrt{2}\,\sigma}\right) \tag{1.6.5}$$

in terms of the "error function."

Details must be omitted here with respect to the wide class of situations 
in which the use of this so-called normal-distribution law is justifiable, 
but the literature on this subject is extensive (for example, see Feller [76]). 
In particular, even though the "frequency distribution" of the errors in
a single quantity may not be capable of good approximation by a normal
frequency distribution, of the form specified by (1.6.1), it generally is
true that, when many such independent component errors are compounded,
the resultant distribution can be so approximated.

The parameter $\sigma$ is called the standard deviation of the distribution.
It is easily seen that the points of inflection of the curve representing
$\phi(\epsilon)$ lie at distance $\sigma$ on each side of the maximum at $\epsilon = 0$. The
parameter $h = 1/(\sqrt{2}\,\sigma)$ is called the modulus of precision and is a measure of
the steepness of the frequency curve near its peak at the origin.

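If $\epsilon$ is a random variable, the mean value of any function $g(\epsilon)$, relative
to the assumed distribution, is given by

$$(g(\epsilon))_{\text{mean}} = \int_{-\infty}^{\infty} \phi(\epsilon)\,g(\epsilon)\,d\epsilon = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} e^{-\epsilon^2/2\sigma^2} g(\epsilon)\,d\epsilon, \tag{1.6.6}$$

under the assumption that this integral exists. In particular, since $\phi(\epsilon)$
is an even function of $\epsilon$, we verify directly that the mean value of $\epsilon$ itself
then is indeed zero, and we find also, for example, that

$$|\epsilon|_{\text{mean}} = 2\int_{0}^{\infty} \epsilon\,\phi(\epsilon)\,d\epsilon = \sqrt{\frac{2}{\pi}}\,\sigma \tag{1.6.7}$$

and

$$(\epsilon^2)_{\text{mean}} = 2\int_{0}^{\infty} \epsilon^2 \phi(\epsilon)\,d\epsilon = \sigma^2. \tag{1.6.8}$$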


Mean values of higher powers of $|\epsilon|$ can be expressed similarly, in terms
of $\sigma$.
The parameter $\sigma$ could be determined in such a way that any one
of the "moments," thus calculated for an assumed normal distribution,
is made to agree with the corresponding moment of the distribution actually
under consideration, if that moment could be calculated or approximated.
It happens that the choice of the second moment is the most convenient
one, and it also is recommended by certain theoretical considerations.
Thus we determine the parameter $\sigma$ of the approximating normal
distribution (1.6.1) in such a way that it is equal to the square root of the
mean of the squared errors in the true distribution, that is, to the
root-mean-square (RMS) error:

$$\sigma = \epsilon_{RMS}. \tag{1.6.9}$$

In general, $\epsilon_{RMS}$ for the entire distribution can be estimated only from a
sample of, say, the deviations of k measurements from their mean value,
and an appropriate estimate is then afforded by the formula†

$$\epsilon_{RMS} \approx \sqrt{\frac{1}{k}\,(\epsilon_1^2 + \epsilon_2^2 + \cdots + \epsilon_k^2)}. \tag{1.6.10}$$


Having obtained such an approximation to $\sigma$, one can make use of
(1.6.5) to estimate the probability that the magnitude of a random
error will exceed (or not exceed) a certain specified amount. A few useful
entries of $1 - P$ are listed in Table 1.4. Thus, the probability of an error
of magnitude greater than $\epsilon_{RMS}$ is 0.317. Only 20 per cent of the errors
should exceed $1.282\,\epsilon_{RMS}$, 10 per cent should exceed $1.645\,\epsilon_{RMS}$, and 1 per
cent should exceed $2.576\,\epsilon_{RMS}$, if the distribution is sufficiently nearly
normal. The quantity $0.67449\,\sigma$ is often called the probable error of the
distribution. It should be noticed that this is merely that number which should
be exceeded by the magnitude of half the errors; it is in no sense the most
probable error, as the name tends to suggest. The quantity $\sigma^2$ is often
called the variance.

TABLE 1.4

  ε/ε_RMS | 1 − P(ε)
 ---------+----------
   0.674  |  0.500
   0.842  |  0.400
   1.000  |  0.317
   1.036  |  0.300
   1.282  |  0.200
   1.645  |  0.100
   2.576  |  0.010

† A theoretically better estimate, which tends to take into account the deviation
of the mean of the observations from the unknown true mean, is obtained by
replacing k by k − 1 in (1.6.10). However, this modification is of practical
significance only when k is small, in which cases the validity of the statistical analysis
itself may be open to question.
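The entries of Table 1.4 follow from the error-function form of $P$. A short check, assuming the standard `math.erfc` and the relation $1 - P = \operatorname{erfc}(\epsilon/\sqrt{2}\,\sigma)$; the name `prob_exceed` is hypothetical, and the dictionary simply transcribes the table.

```python
import math

def prob_exceed(ratio: float) -> float:
    """1 - P for an error of magnitude ratio * sigma under the normal law."""
    return math.erfc(ratio / math.sqrt(2))

table_1_4 = {0.674: 0.500, 0.842: 0.400, 1.000: 0.317, 1.036: 0.300,
             1.282: 0.200, 1.645: 0.100, 2.576: 0.010}
# Largest disagreement between the tabulated and the recomputed probabilities.
worst = max(abs(prob_exceed(r) - q) for r, q in table_1_4.items())
```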

If the approximation (1.6.10) were calculated for a large number of
sets of samples, each containing k errors chosen at random from the same
distribution, and if the mean of the estimates were selected as the best
approximation to the true $\epsilon_{RMS}$, the deviations of the various estimates
from this best one would also be normally distributed, to a first
approximation, with an RMS value of $\epsilon_{RMS}/\sqrt{2k}$, when k is sufficiently large.
This fact is often useful in estimating the reliability of the estimated value
of $\sigma$.

Next, suppose that $\epsilon$ is the sum of two independent errors $u$ and $v$, each
of which varies about a zero mean. Then the mean value of $\epsilon^2$ is the sum
of the mean values of $u^2$, $2uv$, and $v^2$. But, since $u$ and $v$ are independent,
the mean of $uv$ is the product of the means of $u$ and $v$ and hence is zero.
Thus there follows

$$\epsilon_{RMS}^2 = u_{RMS}^2 + v_{RMS}^2. \tag{1.6.11}$$

This argument generalizes to show that the RMS value of the sum of n
independent errors (each having a zero mean) is the square root of the sum
of the squares of the RMS values of the component errors.

It can be shown (see Prob. 24) that the normal-distribution law has
the property that, if $u$ and $v$ are independent and normally distributed,
with standard deviations $\sigma_u$ and $\sigma_v$, then $\epsilon = u + v$ is also normally
distributed, with standard deviation $\sigma = \sqrt{\sigma_u^2 + \sigma_v^2}$. Thus if, in accordance
with (1.6.9), we identify $\sigma_u$ with $u_{RMS}$ and $\sigma_v$ with $v_{RMS}$, it will follow also
that $\sigma_{u+v} = (u + v)_{RMS}$.

In illustration, if each of the numbers in the sum

426.44 − 43.26 + 2.72 + 9.61 − 104.26 − 218.72

is the mean of a set of observations, and if the (approximate)
RMS error associated with each is, say, 0.05, then the formal sum 72.53
would possess an RMS error of $\sqrt{6}\,(0.05) = 0.12$. Such a result is
often recorded as 72.53 ± 0.12, although some writers use the probable
error (0.674)(0.12) = 0.08, and write 72.53 ± 0.08, while still others use
the notation $N \pm d$ to indicate that $d$ is the maximum error in $N$ (which
would be undefined in the present case).
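The RMS error of such a sum is the root of the sum of the squared RMS errors of its terms. A brief sketch of the arithmetic in this illustration, with illustrative variable names and the common RMS error 0.05 assumed for every term:

```python
import math

terms = [426.44, -43.26, 2.72, 9.61, -104.26, -218.72]
rms_each = 0.05                          # assumed RMS error of every term

formal_sum = sum(terms)
rms_sum = math.sqrt(sum(rms_each ** 2 for _ in terms))   # root of the sum of squares
probable_error = 0.674 * rms_sum         # magnitude exceeded by half the errors
```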
If we consider the error $\epsilon$ which arises from rounding a number to n
decimal places, it is clear that the distribution of values of $\epsilon$ will not be
well approximated by any normal distribution, since here the frequency
function has the constant value $1/(2|\epsilon|_{\max})$ when $|\epsilon| < |\epsilon|_{\max} = 5 \times 10^{-n-1}$
and the value zero otherwise. However, the distribution function
corresponding to errors which are (exactly or approximately) linear
combinations of many such errors generally will be appropriate for approximation
by a normal distribution function. Thus, in such cases, the error
analysis may be based with some confidence upon the result of treating the
individual errors as though they were normally distributed. (See Prob. 
Yb.) : 
For this purpose, we may notice that if $x$ takes on all values between
$-\tfrac{1}{2}$ and $\tfrac{1}{2}$, and if all those values are equally likely, the RMS value of $x$ is

$$\left(\int_{-1/2}^{1/2} x^2\,dx\right)^{1/2} = \tfrac{1}{6}\sqrt{3} = 0.2887.$$

Hence, if $\epsilon$ is the round-off error due to rounding to the nth decimal place, there
follows

$$\epsilon_{RMS} = 0.2887 \times 10^{-n}. \tag{1.6.12}$$

Thus, if k numbers are each rounded to n decimal places, the error in
the sum of the results can be considered to be normally distributed, with
an RMS value of $0.2887\sqrt{k} \times 10^{-n}$, if k is not too small.

In particular, when 1000 such numbers are added, the RMS error in
the sum is less than 10 units in the nth place. According to Table 1.4,
the probability of an error of 17 units is less than 0.1, and the odds are
99 to 1 that the error will not exceed 26 units. Nevertheless, an error of
500 units in the nth place is indeed possible.
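The figures quoted for 1000 rounded numbers follow from the RMS value of a uniformly distributed round-off and the multipliers 1.645 and 2.576 of Table 1.4. A minimal sketch with illustrative variable names, working in units of the nth decimal place:

```python
import math

rms_unit = math.sqrt(1 / 12)             # RMS of an error uniform on (-1/2, 1/2)
k = 1000
rms_sum = rms_unit * math.sqrt(k)        # RMS error of the sum of k round-offs
exceeded_10_percent = 1.645 * rms_sum    # exceeded with probability about 0.1
exceeded_1_percent = 2.576 * rms_sum     # exceeded with probability about 0.01
worst_case = k / 2                       # every round-off a full half unit, same sign
```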

In order that such analyses be reliable, it is essential that the round-offs
be effected without bias. Methods are available in the literature for
the purpose of minimizing the possibility of its occurrence (see Forsythe
[70]).

1.7. Mathematical Preliminaries. In this section, we list certain analytical
results to which reference occasionally will be made in the sequel.
Proofs of most are omitted.

First, it may be stated that, in most of the following chapters, it is
supposed that all functions dealt with are real and continuous in the range
considered and, in addition, that they possess as many continuous
derivatives as the analysis may require.

The basic fact that a function $f(x)$ which is continuous for $a \le x \le b$
takes on each value between $f(a)$ and $f(b)$ is intuitively "obvious," but is
capable of rigorous proof. Two immediate consequences of this result
are the following:
THEOREM 1. If $f(x)$ is continuous for $a \le x \le b$, and if $f(a)$ and $f(b)$
are of opposite sign, then $f(\xi) = 0$ for at least one number $\xi$ such that
$a < \xi < b$.

THEOREM 2. If $f(x)$ is continuous for $a \le x \le b$, and if $\lambda_1$ and $\lambda_2$ are
positive constants, then $\lambda_1 f(a) + \lambda_2 f(b) = (\lambda_1 + \lambda_2) f(\xi)$ for at least one
$\xi$ such that $a \le \xi \le b$.

If also $f'(x)$ exists and is continuous for $a < x < b$, two additional
results can be established:

THEOREM 3. If $f(x)$ is continuous for $a \le x \le b$ and $f'(x)$ is continuous
for $a < x < b$, and if $f(a) = f(b) = 0$, then $f'(\xi) = 0$ for at least
one $\xi$ such that $a < \xi < b$. (This is Rolle's theorem.)

THEOREM 4. If $f(x)$ is continuous for $a \le x \le b$ and $f'(x)$ is continuous
for $a < x < b$, then $f(b) - f(a) = (b - a) f'(\xi)$ for at least one $\xi$
such that $a < \xi < b$. (This is the mean-value theorem for the derivative.)

In the following statements, it is assumed that the integrals involved
exist and that $b > a$.

THEOREM 5. If $|f(x)| \le M$ in $(a,b)$, where $M$ is a constant, then

$$\left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx \le M(b - a).$$

THEOREM 6. If $f(x)$ is continuous for $a \le x \le b$, then

$$\int_a^b f(x)\,dx = (b - a)\,f(\xi)$$

for at least one $\xi$ such that $a < \xi < b$. (This is the first law of the mean.)

THEOREM 7. If $m \le f(x) \le M$ and $g(x)$ is nonnegative, for $a \le x \le b$,
then

$$m\int_a^b g(x)\,dx \le \int_a^b f(x) g(x)\,dx \le M\int_a^b g(x)\,dx.$$

THEOREM 8. If $f(x)$ is continuous for $a \le x \le b$ and $g(x)$ does not
change sign inside $(a,b)$, then

$$\int_a^b f(x) g(x)\,dx = f(\xi)\int_a^b g(x)\,dx$$

for at least one $\xi$ such that $a < \xi < b$. (This is the second law of the
mean.)

The three following theorems with relation to integrals involving a
parameter are of frequent use:

THEOREM 9. If $a$ and $b$ are finite constants and $F(x,s)$ is continuous
in $x$ and $s$, then

$$\lim_{x \to c}\int_a^b F(x,s)\,ds = \int_a^b F(c,s)\,ds.$$

THEOREM 10. If $a$ and $b$ are finite constants and if $\partial F/\partial x$ is continuous,
then

$$\frac{d}{dx}\int_a^b F(x,s)\,ds = \int_a^b \frac{\partial F(x,s)}{\partial x}\,ds.$$

THEOREM 11. If $a$ is a finite constant, $u$ is a differentiable function of
$x$, and $\partial F/\partial x$ is continuous, then

$$\frac{d}{dx}\int_a^u F(x,s)\,ds = \int_a^u \frac{\partial F(x,s)}{\partial x}\,ds + F(x,u)\,\frac{du}{dx}.$$

Finally, a useful result relevant to repeated integration may be stated
as follows:

THEOREM 12. If $F_n(x)$ denotes the result of integrating $F(x)$ successively
n times over $(a,x)$, then

$$F_n(x) = \frac{1}{(n-1)!}\int_a^x (x - s)^{n-1} F(s)\,ds.$$


The truth of each of these assertions, except perhaps for the last two,
is nearly self-evident, and the details of their proofs are rather easily
supplied once the preliminary basic properties of continuous functions
are established.

The validity of Theorem 11 follows from the fact that, if we write
$I(x) = \int_a^u F(x,s)\,ds$, there follows

$$I(x + \Delta x) - I(x) = \int_a^u [F(x + \Delta x, s) - F(x,s)]\,ds + \int_u^{u + \Delta u} F(x + \Delta x, s)\,ds
= \left(\int_a^u F_x(\xi,s)\,ds\right)\Delta x + F(x + \Delta x, \eta)\,\Delta u,$$

where $\xi$ is between $x$ and $x + \Delta x$ and $\eta$ is between $u$ and $u + \Delta u$, in virtue
of Theorems 4 and 6, and hence

$$I'(x) = \lim_{\Delta x \to 0}\int_a^u F_x(\xi,s)\,ds + \lim_{\Delta x \to 0} F(x + \Delta x, \eta)\,\frac{\Delta u}{\Delta x}
= \int_a^u F_x(x,s)\,ds + F(x,u)\,\frac{du}{dx},$$

in virtue of the content of Theorem 9.

Theorem 12 can be established by successive integration by parts, or
verified by the use of Theorem 11, making use of the facts that, from the
definition, there follows $F_k'(x) = F_{k-1}(x)$ and $F_k(a) = 0$ for $k = 1, 2, \ldots, n$,
and $F_1'(x) = F_0(x) = F(x)$.
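Theorem 12 can be spot-checked numerically. The sketch below is an illustration under stated assumptions: it takes $F(s) = \cos s$, $a = 0$, and $n = 3$, for which three successive integrations give $x - \sin x$, and it evaluates the single integral by a composite Simpson rule; the helper names `simpson` and `repeated_integral` are hypothetical.

```python
import math

def simpson(f, a, b, panels=200):
    """Composite Simpson rule with an even number of panels."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def repeated_integral(F, a, x, n):
    """n-fold integral of F over (a, x), written as the single integral of Theorem 12."""
    return simpson(lambda s: (x - s) ** (n - 1) * F(s), a, x) / math.factorial(n - 1)

value = repeated_integral(math.cos, 0.0, 1.2, 3)
exact = 1.2 - math.sin(1.2)      # cos integrated three times from 0 gives x - sin x
```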

This result is useful in deriving the finite Taylor series, with an error
term expressed in a form which is often more useful than that given in
(1.3.2), and also for deriving the form given there. For if we write

$$F(x) = f^{(n)}(x) \equiv \frac{d^n f(x)}{dx^n} \tag{1.7.1}$$

and use the notation of Theorem 12, the results of integrating the equal
members successively over $(a,x)$ are seen to be

$$F_1(x) = f^{(n-1)}(x) - f^{(n-1)}(a),$$
$$F_2(x) = f^{(n-2)}(x) - f^{(n-2)}(a) - (x - a)\,f^{(n-1)}(a),$$
$$F_3(x) = f^{(n-3)}(x) - f^{(n-3)}(a) - (x - a)\,f^{(n-2)}(a) - \frac{(x - a)^2}{2!}\,f^{(n-1)}(a),$$

and finally, after n integrations,

$$F_n(x) = f(x) - f(a) - (x - a)\,f'(a) - \frac{(x - a)^2}{2!}\,f''(a) - \cdots - \frac{(x - a)^{n-1}}{(n-1)!}\,f^{(n-1)}(a). \tag{1.7.2}$$

Thus, after a transposition and a reference to Theorem 12, we deduce
that if the nth derivative of $f(x)$ exists and is integrable throughout an
interval including $x = a$, then in that interval there follows

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!}(x - a)^{n-1} + E(x), \tag{1.7.3}$$

where

$$E(x) = \frac{1}{(n-1)!}\int_a^x (x - s)^{n-1} f^{(n)}(s)\,ds. \tag{1.7.4}$$

Further, since $(x - s)^{n-1}$ does not change sign as $s$ varies from $a$ to $x$,
we can invoke the second law of the mean (Theorem 8) to rewrite (1.7.4)
in the form

$$E(x) = \frac{f^{(n)}(\xi)}{(n-1)!}\int_a^x (x - s)^{n-1}\,ds = \frac{f^{(n)}(\xi)}{n!}(x - a)^n \qquad (\xi \text{ between } a \text{ and } x), \tag{1.7.5}$$

under the additional assumption that $f^{(n)}(x)$ is continuous. Whereas the
form (1.7.5) has the advantage of simplicity, the form (1.7.4) is often
preferable because of the fact that it is explicit, while (1.7.5) involves a
parameter which is known only to lie between $a$ and $x$.
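The two forms of the Taylor error term can be compared on a concrete case. A sketch under stated assumptions: $f = \exp$, $a = 0$, $x = 1$, $n = 5$, with the integral form evaluated by a composite Simpson rule and the mean-value form bracketed by its values at the two ends of the interval; all names are illustrative.

```python
import math

def taylor_partial_sum_exp(x, a, n):
    """Terms through degree n - 1 of the Taylor series of exp about a."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n))

def remainder_integral_exp(x, a, n, panels=400):
    """Integral form of the error term for f = exp, by the composite Simpson rule."""
    h = (x - a) / panels
    g = lambda s: (x - s) ** (n - 1) * math.exp(s)
    total = g(a) + g(x)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3 / math.factorial(n - 1)

x, a, n = 1.0, 0.0, 5
actual = math.exp(x) - taylor_partial_sum_exp(x, a, n)
integral_form = remainder_integral_exp(x, a, n)
low = math.exp(a) * (x - a) ** n / math.factorial(n)    # mean-value form with xi = a
high = math.exp(x) * (x - a) ** n / math.factorial(n)   # mean-value form with xi = x
```

The integral form reproduces the actual error, whereas the mean-value form only brackets it, which is the distinction drawn in the text.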
A useful generalization of the Taylor-series expansion (1.7.3) can be
obtained by starting with the representation

$$F(t) = F(0) + \sum_{k=1}^{n-1} c_k t^k + E, \tag{1.7.6}$$

where

$$c_k = \frac{1}{k!}\left[\frac{d^k F(t)}{dt^k}\right]_{t=0}, \qquad E = \frac{t^n}{n!}\left[\frac{d^n F(t)}{dt^n}\right]_{t=\tau}, \tag{1.7.7}$$

with $\tau$ between 0 and $t$, and writing

$$t = g(x) - g(a), \qquad F(t) = f(x), \tag{1.7.8}$$

under the assumption that

$$g'(x) \ne 0 \tag{1.7.9}$$

over some interval $I$ including $x = a$, so that $g(x) - g(a)$ increases or
decreases steadily as $x$ increases over $I$. The result of this substitution
takes the form

$$f(x) = f(a) + \sum_{k=1}^{n-1} c_k\,[g(x) - g(a)]^k + E, \tag{1.7.10}$$

where

$$c_k = \frac{1}{k!}\left[\left(\frac{1}{g'(x)}\frac{d}{dx}\right)^{k} f(x)\right]_{x=a}, \qquad
E = \frac{[g(x) - g(a)]^n}{n!}\left[\left(\frac{1}{g'(x)}\frac{d}{dx}\right)^{n} f(x)\right]_{x=\xi}, \tag{1.7.11}$$

and where $\xi$ lies between $a$ and $x$, when $x$ is in $I$, under the assumption
that $f^{(n)}(x)$ and $g^{(n)}(x)$ are continuous and $g'(x) \ne 0$ in $I$.
If we define a sequence of auxiliary functions by the
recurrence formula

$$a_k(x) = \frac{a_{k-1}'(x)}{g'(x)} \qquad (k = 1, 2, \ldots), \tag{1.7.12}$$

with

$$a_0(x) = f(x), \tag{1.7.13}$$

it follows that

$$c_k = \frac{1}{k!}\,a_k(a), \qquad E = \frac{1}{n!}\,a_n(\xi)\,[g(x) - g(a)]^n. \tag{1.7.14}$$

The expansion (1.7.10) is often known as a Bürmann series and is
useful when a certain value of a function $g(x)$ is known and the
corresponding value of a second function $f(x)$ is required. The special case
when $f(x)$ is identified with $x$ itself is of most frequent occurrence.

It can be shown (see Whittaker and Watson [235]) that the coefficient
$c_k$ can also be expressed by the formula

$$c_k = \frac{1}{k!}\left[\frac{d^{k-1}}{dx^{k-1}}\left\{f'(x)\left(\frac{x - a}{g(x) - g(a)}\right)^{k}\right\}\right]_{x=a}. \tag{1.7.15}$$

While this is the form usually given, the use of the form given in (1.7.11)
or (1.7.14) often leads to a somewhat less involved calculation,
particularly when $(x - a)$ cannot be explicitly factored from $g(x) - g(a)$.
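As a hedged worked illustration of the recurrence $a_k(x) = a_{k-1}'(x)/g'(x)$, $a_0(x) = f(x)$, with $c_k = a_k(a)/k!$, consider the choice (ours, not the text's) $f(x) = x$, $g(x) = x e^x$, $a = 0$, so that the series expresses $x$ in powers of $w = x e^x$. Here $g'(x) = (1 + x)e^x \ne 0$ near $x = 0$, and the first two coefficients are obtained as follows.

```latex
\begin{aligned}
a_0(x) &= x, \\
a_1(x) &= \frac{a_0'(x)}{g'(x)} = \frac{e^{-x}}{1 + x}, & c_1 &= a_1(0) = 1, \\
a_2(x) &= \frac{a_1'(x)}{g'(x)} = -\frac{(2 + x)\,e^{-2x}}{(1 + x)^3}, & c_2 &= \tfrac{1}{2}\,a_2(0) = -1, \\
x &= w - w^2 + \cdots, & w &= x e^{x}.
\end{aligned}
```

For $x = 0.1$ one has $w \approx 0.11052$ and $w - w^2 \approx 0.09830$, already within about 2 per cent of $x$; no differentiation of the quotient $(x - a)/[g(x) - g(a)]$ is needed in this form.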
Lastly, mention should be made of the so-called fundamental theorem
of elementary algebra, namely, that any polynomial† other than a
constant possesses at least one zero, the usual proofs of which depend upon
results established in the theory of analytic functions of a complex
variable. Elementary treatments assume the truth of this theorem and
deduce easily that any polynomial of degree n possesses exactly n zeros, with 
the convention that repeated zeros are to be counted a number of times 
equal to their multiplicities. This last result will be of basic importance 
in the sequel. 

1.8. Supplementary References. The bibliography (Appendix B) lists 
many of the existing general texts on numerical analysis, together with 
a selection of collateral text and journal references and of certain sources 
of relevant mathematical tables and formulas. Norlund [13] and Whit- 
taker and Robinson [20] are good sources of references to the classical 
works. Graphical methods, which are not treated here, are included in 
Lipka [8], Von Sanden [16], and Willers [21]. References [60], [73], [104], 
and [187] provide information concerning modern, large-scale calculating 
machines. For comprehensive treatments of the general theory of 
approximation, see Jackson [123], de la Vallée Poussin [229], and Walsh 
[232]. An illustration of the precise analysis of round-off errors in 
machine calculation is presented by Householder [3] (chap. 1). The 
general mathematical bases of probability and statistical analysis may 
be found in texts such as Cramér [63] and Feller [76]; associated tech- 
niques and applications are dealt with by Fisher [78], Hoel [113], Mood 
[162], and others. 


PROBLEMS 
Section 1.2 


1. Determine $A_0$, $A_1$, and $A_2$ such that the function $y(x) = A_0 + A_1 x + A_2 x^2$ and
the function $f(x) = 1/(1 + x)$ have each of the following sets of properties in common:

(a) (0), f(@), FQ). 

(b) £(0), f/(0), f’). 

(c) f@), f'@), I" @)- 

(@) 0), $4), 70). 


(e) fir da, Te af (x) dx, ik f(x) dx. 


2. Calculate three-place values of the function $f(x) = 1/(1 + x)$ and of each of the
parabolic approximations obtained in Prob. 1 at an interval of 0.1 over (0,1), and plot
curves representing the errors in each approximation on a common graph.

3. Determine that member $y(x)$ of the set of all linear functions which best
approximates the function $f(x) = x^2$ over (0,1) in the sense that each of the following
quantities is minimized:

(a) $\int_0^1 [f(x) - y(x)]^2\,dx$.

(b) $[f(0) - y(0)]^2 + [f(\tfrac12) - y(\tfrac12)]^2 + [f(1) - y(1)]^2$.

(c) $\max_{0 \le x \le 1} |f(x) - y(x)|$.

(d) $\int_0^1 x(1 - x)[f(x) - y(x)]^2\,dx$.

† The term polynomial is to be used in its common restricted sense, to denote an
expression of the form $a_0 x^n + a_1 x^{n-1} + \cdots + a_n$, where n is a nonnegative integer
and the a's are constants.


4. Determine $c_{-1}$, $c_0$, and $c_1$ in such a way that the formula

\[ \int_{-1}^{1} w(x) f(x)\,dx = c_{-1} f(-1) + c_0 f(0) + c_1 f(1) \]

yields an exact result when $f(x)$ is 1, $x$, $x^2$, and $x^3$, and hence also when $f(x)$ is any
linear combination of those functions, for each of the following weighting functions:

(a) $w(x) = 1$; (b) $w(x) = \sqrt{1 - x^2}$; (c) $w(x) = \dfrac{1}{\sqrt{1 - x^2}}$.
Section 1.3 
5. Let $S = u_0 + u_1 + \cdots + u_k + R_k$ for $k = 0, 1, \ldots$. By noticing that

\[ u_n + R_n = R_{n-1}, \qquad u_{n+1} + R_{n+1} = R_n, \]

deduce that if $R_n$ and $R_{n-1}$ have opposite signs, then $R_n$ is smaller than $u_n$ in magnitude
and is of opposite sign, whereas if also $R_n$ and $R_{n+1}$ have opposite signs, then $R_n$ is
smaller than $u_{n+1}$ in magnitude, and is of the same sign. (This is often known as
Steffensen's error test.)

6. Let $S_k = v_0 - v_1 + v_2 - \cdots + (-1)^{k-1} v_{k-1}$ for $k = 1, 2, \ldots$, where all v's are
positive. Assume also that $v_{k+1} < v_k$ for all $k$, and that $v_k \to 0$ as $k \to \infty$. Show that
$S_{2k}$ is positive and increasing with $k$, but that $S_{2k}$ cannot exceed $v_0$. Hence deduce
that $S_{2k}$ tends to a limit as $k \to \infty$. Show also that $S_{2k+1}$ tends to the same limit, and
hence that the series

\[ \sum_{k=0}^{\infty} (-1)^k v_k \]

then converges to a limit $S$. Finally, show that the truncation error $R_k$ is of the same
sign as the first neglected term and is smaller than that term in magnitude. (Notice
that any finite number of terms not satisfying the stated requirements may be added
to the series initially, without impairing its convergence.)
7. Suppose that the alternating series

\[ S = v_0 - v_1 + v_2 - \cdots = \sum_{k=0}^{\infty} (-1)^k v_k \]

converges. Show that the series

\[ \tfrac12 v_0 + \tfrac12 (v_0 - v_1) - \tfrac12 (v_1 - v_2) + \cdots = \tfrac12 v_0 + \tfrac12 \sum_{k=0}^{\infty} (-1)^k (v_k - v_{k+1}) \]

converges to the same sum.

8. Use the transformation of Prob. 7 repeatedly to show that

\[ S \equiv \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1}
   = \frac12 + \frac12 \sum_{k=0}^{\infty} \frac{(-1)^k}{(k+1)(k+2)}
   = \frac58 + \frac12 \sum_{k=0}^{\infty} \frac{(-1)^k}{(k+1)(k+2)(k+3)}
   = \frac23 + \frac34 \sum_{k=0}^{\infty} \frac{(-1)^k}{(k+1)(k+2)(k+3)(k+4)}. \]

Show that the retention of five terms in the last sum given ensures that 0.69306 <
S < 0.69330 or that S ≈ 0.69318 with a maximum error of ±12 units in the place of
the fifth digit. About how many terms of the original series would be needed to
ensure this accuracy? (The true value is S = log 2 = 0.69315.)
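
As a numerical check on the bounds quoted in Prob. 8 (an illustrative sketch only, not part of the original text), the following Python fragment sums five terms of the last transformed series and brackets S by means of the first neglected term, which the result of Prob. 6 permits for an alternating series with decreasing terms. The function and variable names are assumptions made for this illustration.

```python
import math

def transformed_partial_sum(n_terms):
    """2/3 + (3/4) * sum over k < n_terms of (-1)^k / ((k+1)(k+2)(k+3)(k+4))."""
    total = 0.0
    for k in range(n_terms):
        total += (-1) ** k / ((k + 1) * (k + 2) * (k + 3) * (k + 4))
    return 2.0 / 3.0 + 0.75 * total

five_terms = transformed_partial_sum(5)

# The first neglected term (k = 5) is negative, so S lies between
# the five-term sum and that sum reduced by the magnitude of the term.
neglected = 0.75 / (6 * 7 * 8 * 9)
lower, upper = five_terms - neglected, five_terms

print(lower, upper, math.log(2))
```

The bracket is obtained from the first neglected term alone, so no knowledge of the true sum is used; the value of `math.log(2)` is printed only for comparison.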


9. If $f(x)$ is a positive decreasing function of $x$, and if $\int_K^{\infty} f(x)\,dx$ exists for
some $K$, show that $\sum_{k=1}^{\infty} f(k)$ converges. Show also that

\[ \int_K^{\infty} f(x)\,dx < \sum_{k=K}^{\infty} f(k) < \int_{K-1}^{\infty} f(x)\,dx. \]


How many terms of the series 
oo 


i 
> eri 


k=1 


would be required to determine the sum to four digits? 
10. By making appropriate use of the known results


i ee tat,
ae ar ke 945
= k=1 k=1


evaluate the sum


ead nN gl
> ea b} e@ ea +ih\~
k=1 k=1


correctly to four digits.

11. The error function is defined by the relation

\[ \operatorname{erf} x = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt. \]
It is known that $\operatorname{erf} x \to 1$ as $x \to \infty$. With the definitions

\[ F_1(x) = \int_0^x e^{-t^2}\,dt, \qquad F_2(x) = \int_x^{\infty} e^{-t^2}\,dt, \]

there follows

\[ \operatorname{erf} x = \frac{2}{\sqrt{\pi}} F_1(x) = 1 - \frac{2}{\sqrt{\pi}} F_2(x). \]

Show that

\[ F_1(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)\,k!}, \]

where the series converges for all $x$. About how many terms would be required for
five-digit accuracy when $x = 0.2$, 1, and 2?


12. With the notation of Prob. 11, make use of repeated integration by parts to
show that

\[ \int_0^x e^{-t^2}\,dt = x e^{-x^2} + \frac{2}{3} x^3 e^{-x^2} + \cdots
   + \frac{2^{n-1}}{1\cdot 3\cdots(2n-1)}\, x^{2n-1} e^{-x^2}
   + \frac{2^{n}}{1\cdot 3\cdots(2n-1)} \int_0^x t^{2n} e^{-t^2}\,dt, \]

and hence that

\[ F_1(x) = e^{-x^2}\left[x + \frac{2}{3}x^3 + \frac{2^2}{3\cdot 5}x^5 + \cdots
   + \frac{2^{n-1}}{1\cdot 3\cdots(2n-1)}\, x^{2n-1}\right] + E_n(x), \]

where

\[ E_n(x) = \frac{2^{n}}{1\cdot 3\cdots(2n-1)} \int_0^x t^{2n} e^{-t^2}\,dt. \]

Show also that $E_n(x)$ is smaller in magnitude than the term following the last one
retained in the coefficient of $e^{-x^2}$, and is of the same sign, that the relative error cannot
exceed $(2x)^{2n} n!/(2n)!$, and that the infinite series obtained when $n \to \infty$ converges
for all $x$.

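13. With the notation of Prob. 11, show that

\[ e^{x^2} F_2(x) = \int_x^{\infty} \left(t\,e^{x^2 - t^2}\right) \frac{dt}{t} \]

and, after successive integrations by parts (each followed by multiplication and
division by $t$ in the integrand), deduce that

\[ e^{x^2} F_2(x) = \frac12\left[\frac1x - \frac12\,\frac{1}{x^3} + \frac12\cdot\frac32\,\frac{1}{x^5} - \cdots
   + (-1)^n \left(\frac12\cdot\frac32\cdots\frac{2n-1}{2}\right)\frac{1}{x^{2n+1}}\right]
   + (-1)^{n+1}\left(\frac12\cdot\frac32\cdots\frac{2n+1}{2}\right)\int_x^{\infty} e^{x^2 - t^2}\,\frac{dt}{t^{2n+2}}, \]

and hence that

\[ e^{x^2} F_2(x) = \frac12\left[\frac1x - \frac12\,\frac{1}{x^3} + \frac12\cdot\frac32\,\frac{1}{x^5} - \cdots
   + (-1)^n \left(\frac12\cdot\frac32\cdots\frac{2n-1}{2}\right)\frac{1}{x^{2n+1}}\right] + E_n(x), \]

where

\[ E_n(x) = (-1)^{n+1}\left(\frac12\cdot\frac32\cdots\frac{2n+1}{2}\right)\int_x^{\infty} e^{x^2 - t^2}\,\frac{dt}{t^{2n+2}}. \]

Show also that the series is divergent but asymptotic (in the strict sense) and that the
truncation error due to neglect of $E_n(x)$ is smaller than the last term retained and of
opposite sign. Obtain the best possible approximation when $x = 2$, and give numerical
bounds on the error.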

14. By making use of the known expansions of $e^x$, $\cos x$, $\sin x$, and $(1 - x)^{1/2}$ in
powers of $x$, obtain the coefficients of powers of $x$ through the fourth in the
corresponding expansions of the following functions:


(a) of con a; (b) cone 3 (ce) (cos a)h; (d) etme, 
15. Under the assumption that a given series

\[ y = a_0 + a_1 x + a_2 x^2 + \cdots \qquad (a_1 \ne 0) \]

converges for sufficiently small values of $x$, and that $x$ can be expanded in a series of
powers of $(y - a_0)/a_1$ which converges for $y$ sufficiently near to $a_0$, in the form

\[ x = u + A_2 u^2 + A_3 u^3 + \cdots \qquad \left(u = \frac{y - a_0}{a_1}\right), \]

show that the leading coefficients in the inverted series can be determined from the
relations

\[ a_1 A_2 = -a_2, \]
\[ a_1 A_3 = -2a_2 A_2 - a_3, \]
\[ a_1 A_4 = -a_2 (A_2^2 + 2A_3) - 3a_3 A_2 - a_4. \]

Show also that the first n terms of the inverted series can be obtained by a sequence of
n - 1 substitutions in the right-hand member of the relation

\[ x = u - \frac{a_2}{a_1} x^2 - \frac{a_3}{a_1} x^3 - \cdots, \]

starting with $x = u$, and retaining only powers of $u$ not exceeding the (r + 1)th in
the rth substitution. Illustrate both methods in obtaining the first four terms in the
result of inverting the series $e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$.
16. It is required to determine the symmetrically placed pair of nonzero roots of the
equation

\[ \sinh x = cx, \]

where $c$ is a real constant such that $c > 1$. Show that, with the abbreviations
$s = 6(c - 1)$, $t = x^2$, the problem can be considered as that of inverting the series

\[ s = t + \frac{3!}{5!} t^2 + \frac{3!}{7!} t^3 + \frac{3!}{9!} t^4 + \cdots \]

and deduce the expansion

\[ x^2 = s - \frac{1}{20} s^2 + \frac{2}{525} s^3 - \frac{13}{37800} s^4 + \cdots. \]
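
The three relations of Prob. 15 lend themselves to a mechanical check. The sketch below (an illustration added here, with a hypothetical helper name `inversion_coefficients`) applies them in exact rational arithmetic to the series for $e^x$ of Prob. 15 and to the series in $t$ of Prob. 16. It only evaluates the stated relations; it is not a derivation of them.

```python
from fractions import Fraction as F

def inversion_coefficients(a1, a2, a3, a4):
    """Return (A2, A3, A4) from the relations
       a1*A2 = -a2,
       a1*A3 = -2*a2*A2 - a3,
       a1*A4 = -a2*(A2**2 + 2*A3) - 3*a3*A2 - a4."""
    A2 = -a2 / a1
    A3 = (-2 * a2 * A2 - a3) / a1
    A4 = (-a2 * (A2 ** 2 + 2 * A3) - 3 * a3 * A2 - a4) / a1
    return A2, A3, A4

# y = e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + ...
exp_case = inversion_coefficients(F(1), F(1, 2), F(1, 6), F(1, 24))

# s = t + (3!/5!) t^2 + (3!/7!) t^3 + (3!/9!) t^4 + ...
sinh_case = inversion_coefficients(F(1), F(6, 120), F(6, 5040), F(6, 362880))

print(exp_case)
print(sinh_case)
```

In the first case the coefficients are those of the logarithmic series in $u = y - 1$; in the second they reproduce the coefficients of $s^2$, $s^3$, and $s^4$ in the expansion of $x^2$.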
Sections 1.4, 1.5 


17. Show that the number (2.46): is known within less than one unit in the place 
of its fifth significant digit if 2.46 is known only to be correctly rounded to three digits.

18. Using only five-place tables of sin x and cos x, determine cos 0.10 — cos 0.12 
and tan 0.12 — tan 0.10 to four significant figures. 

19. Values of $\cos x$ are calculated from a five-place table of $\sin x$, by use of the
formula $\cos x = (1 - \sin^2 x)^{1/2}$. What can be said about the accuracy of the calculated
values?

20. If all coefficients in the definition 


(a) = 5.082412 + 0.11095 
f@) = 0759952 + 0.014915 


are rounded numbers, to how many significant figures is $f(x)$ determinate when $x$ is
known only to round to 3.26?

21. If $f(x) = (\sinh x - \sin x)/(\cosh x - \cos x)$, determine $f(0.1)$ to 10 significant
figures.
22. Determine bounds on the degree of indeterminacy of each of the quantities
$\tan^{-1} 4.017216$, $\sin^{-1} 0.986423$, $\cos 18.4178$, and $\cos 18417.8$, under the assumption that
the arguments are rounded values. To how many significant figures are the last two
quantities determinate?


Section 1.6 


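23. If $f_1(x)$ and $f_2(x)$ are the frequency functions of $\epsilon_1$ and $\epsilon_2$, respectively, where
$\epsilon_1$ and $\epsilon_2$ are independent random variables, show that the distribution function of
$\epsilon_1 + \epsilon_2$ is

\[ \iint_{s+t \le x} f_1(s) f_2(t)\,ds\,dt = \int_{-\infty}^{x} \left[\int_{-\infty}^{\infty} f_1(u - t) f_2(t)\,dt\right] du, \]

and hence that the frequency function of $\epsilon_1 + \epsilon_2$ is

\[ \int_{-\infty}^{\infty} f_1(x - t) f_2(t)\,dt. \]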


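24. Use the result of Prob. 23 to show that, if $\epsilon_1$ and $\epsilon_2$ are independent and are
normally distributed about zero means, with standard deviations $\sigma_1$ and $\sigma_2$, then
$\epsilon_1 + \epsilon_2$ is also normally distributed about a zero mean, with standard deviation
$\sigma = (\sigma_1^2 + \sigma_2^2)^{1/2}$. [Determine constants $\lambda_1$, $\lambda_2$, and $\alpha$ such that

\[ \frac{(x - t)^2}{\sigma_1^2} + \frac{t^2}{\sigma_2^2} = \frac{x^2}{\lambda_1^2} + \frac{(t - \alpha x)^2}{\lambda_2^2}, \]

and write $t - \alpha x = \sqrt{2}\,\lambda_2 v$, making use of the fact that

\[ \int_{-\infty}^{\infty} e^{-v^2}\,dv = \sqrt{\pi}, \]

in evaluating the integral defining the required frequency function.]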
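25. Suppose that $\epsilon_1, \epsilon_2, \ldots$ are independent random variables with a common
uniform frequency function

\[ f_1(x) = \begin{cases} 1 & (|x| \le \tfrac12), \\ 0 & \text{(otherwise)}, \end{cases} \]

and denote the frequency function of $\epsilon_1 + \epsilon_2 + \cdots + \epsilon_n$ by $f_n(x)$. Use the result
of Prob. 23 to show that

\[ f_{n+1}(x) = \int_{-\infty}^{\infty} f_1(x - t) f_n(t)\,dt = \int_{x - 1/2}^{x + 1/2} f_n(t)\,dt. \]

In particular, deduce that $f_2(x)$ is a triangular function,

\[ f_2(x) = \begin{cases} 1 + x & (-1 \le x \le 0), \\ 1 - x & (0 \le x \le 1), \\ 0 & \text{(otherwise)}, \end{cases} \]

and that $f_3(x)$ is defined by the relations

\[ f_3(x) = \begin{cases} \tfrac12 (x + \tfrac32)^2 & (-\tfrac32 \le x \le -\tfrac12), \\ \tfrac34 - x^2 & (-\tfrac12 \le x \le \tfrac12), \\ \tfrac12 (\tfrac32 - x)^2 & (\tfrac12 \le x \le \tfrac32), \\ 0 & \text{(otherwise)}. \end{cases} \]

Finally, plot each of the functions $f_1$, $f_2$, and $f_3$, and compare it graphically with the
frequency function corresponding to the normal distribution which has the same
standard deviation $\sigma_n = \sqrt{n/12}$ (n = 1, 2, 3).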
26. If the coefficients of the polynomial

\[ f(x) = \sum_{k=0}^{n} a_k x^k \]

are independently subject to random error distributions with mean value zero and with
a common RMS value $\alpha_{\mathrm{RMS}}$, whereas $x$ is subject to an error distribution with RMS
value $\eta_{\mathrm{RMS}}$, show that the corresponding RMS error $\epsilon_{\mathrm{RMS}}$ in $f(x)$ is given approximately
by

\[ \epsilon_{\mathrm{RMS}}^2 \approx \frac{x^{2n+2} - 1}{x^2 - 1}\,\alpha_{\mathrm{RMS}}^2 + [f'(x)]^2\,\eta_{\mathrm{RMS}}^2. \]


27. Use the result of Prob. 26 to estimate the RMS error in the calculated value of

\[ f(x) = 1.47x^3 - 2.48x^2 + 2.21x - 1.65 \]

when $x = 2.03$, under the assumption that the values of $x$ and the coefficients are
known only to be rounded correctly to the three digits given. Within what limits is
$f(x)$ actually determinate in this case? Within what limits does its value lie with
probability of about 0.9?
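28. If $x_1, x_2, \ldots, x_r$ are each rounded to $n$ decimal places, show that the
corresponding RMS error in $f(x_1, x_2, \ldots, x_r)$ is approximated by

\[ \epsilon_{\mathrm{RMS}} \approx (0.29 \times 10^{-n}) \left[\sum_{k=1}^{r} \left(\frac{\partial f}{\partial x_k}\right)^2\right]^{1/2}. \]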
[,2, Ga) | <® 


and if r is not too small (say r > 3), then the odds are about 10 to 1 that the error 
in f does not exceed K units in the nth decimal place. 


Show also that if 


Section 1.7†


29. If (a,b) = (—1,1), show that the conclusions of Theorems 1 and 2 do not hold 
for f(z) = 1/2, that those of Theorems 3 and 4 do not hold for f(z) = 1 — z#, and that 
those of Theorems 7 and 8 do not hold for f(z) = x and g(x) = 2°. Account for each 
of these situations. 

30. Assuming the fact that 





show that 
5 @<0), 
i om zs 0 (2 = 0), 
5 (a > 0). 


† The truth of the theorems stated in §1.7 may be assumed in the following problems.
Thus show that the conclusions of Theorems 9 and 10 do not hold for


F(a,s) = Suze 
with $(a,b) = (0, \infty)$.

31. If $f(x)$ vanishes at $n + 1$ distinct points in the interval $a \le x \le b$, and if
$f^{(n)}(x)$ is continuous for $a \le x \le b$, show that $f^{(n)}(x)$ vanishes at least once in that
interval.

32. If $a_r > 0$ for $r = 1, 2, \ldots, n$, show that

\[ a_1 \sin t + a_2 \sin 2t + \cdots + a_n \sin nt = \sin \theta t \sum_{r=1}^{n} a_r \]

for some $\theta$ such that $1 < \theta < n$.
83. Show that 


i. dt & 1 
2 P+1 <3 = 79) 
and that 





ee 1 
[fee ey <qlow a + 2%) <a (« > 0). 


34. Show that

\[ \int_{-1}^{1} (1 - x^2) f(x)\,dx = \tfrac43 f(\xi) \]

for some $\xi$ in (-1,1), if $f(x)$ is continuous in that interval. Also determine $\xi$ when
le) = 2°. 
46. If F'(k) is defined by the integral 








a (b = 2), 
we the second law of the mean to show that 
Pk) = (—1)e (A-1-—8k-2—-8 +--+ Q— 
() = (=) a )  o@<e<y, 
ond deduce that 
1 
ELD < (CDH < — 


36. If $g(x)$ is continuous and $f(x)$ possesses a continuous derivative, and if

\[ \phi(x) = \int_0^x f(x - t) g(t)\,dt, \]

obtain an expression for $d\phi/dx$. By making an appropriate change of variables in the
definition of $\phi(x)$, obtain an alternative expression for $d\phi/dx$ when the hypotheses
regarding $f$ and $g$ are interchanged.
37. If $y'' = 2 \sin y + 12x^2$ and $y(0) = y'(0) = 0$, show that

\[ y(x) = x^4 + 2 \int_0^x (x - s) \sin y(s)\,ds, \]

and deduce that $y(x)$ lies between $x^4 - x^2$ and $x^4 + x^2$.
38. Determine the first three coefficients in the Bürmann series

\[ \sin x = c_1 (e^x - 1) + c_2 (e^x - 1)^2 + c_3 (e^x - 1)^3 + \cdots \]


and use the result to determine approximately the Nia of a “ 

39. If $y = a_0 + a_1 x + a_2 x^2 + \cdots$ and if $x$ can be expanded in a series
of powers of $y - a_0$ for $y$ near $a_0$, use the Bürmann expansion to show that the leading
coefficients in the expansion

\[ x = c_1 (y - a_0) + c_2 (y - a_0)^2 + c_3 (y - a_0)^3 + \cdots \]

are given by

\[ c_1 = \frac{1}{a_1}, \qquad c_2 = -\frac{a_2}{a_1^3}, \qquad c_3 = \frac{2a_2^2 - a_1 a_3}{a_1^5}, \]

and verify that the results agree with those of Prob. 15.





CHAPTER 2 


INTERPOLATION WITH DIVIDED DIFFERENCES 


2.1. Introduction. Any one who has had occasion to consult tables 
of mathematical functions is familiar with the method of linear inter- 
polation and probably has encountered situations in which this method 
of ‘‘reading between the lines of the table” has appeared to be unreliable. 
If more reliable interpolates are desired, it is clearly necessary to make 
use of more information than that consisting merely of tabulated values 
(ordinates) of a function, corresponding to two successive abscissas. 
Whereas that additional information could consist, for example, of known 
values of certain derivatives of the function at those two points, it is sup- 
posed in most of what follows (an exception is found in §8.2) that the 
interpolation process is to be based only on tabulated values of the function
itself, with any further available information reserved for use in
estimating the error involved.

There exist a number of interpolation formulas which have this property,
most of which possess certain advantages in certain situations, but
no one of which is preferable to all others in all respects. Whereas certain
of these formulas are expressed explicitly in terms of all the ordinates
on which they depend (Chap. 3), most of them involve only one or two
of the ordinates explicitly and express their dependence upon other ordinates
only in terms of differences of ordinates and successive differences
of differences.

In the general case, when the abscissas are not necessarily equally
spaced, the use of so-called divided differences is convenient. The principal
purpose of this chapter is to define such differences and investigate
certain of their properties, to obtain a basic interpolation formula due to
Newton (§2.5), from which most of the other formulas of the type
described can be deduced, and to obtain expressions for the error term
(§2.6). Related methods of iterated linear interpolation (§2.7) and
inverse interpolation (§2.8) are also treated.

2.2. Linear Interpolation. The assumption that a function $f(x)$ is
approximately linear, in a certain range, is equivalent to the assumption
that the ratio

\[ \frac{f(x_1) - f(x_0)}{x_1 - x_0} \tag{2.2.1} \]

is approximately independent of $x_0$ and $x_1$ in that range. This ratio is
called the first divided difference of $f(x)$, relative to $x_0$ and $x_1$, and may be
designated by $f[x_0,x_1]$:†

\[ f[x_0,x_1] = \frac{f(x_1) - f(x_0)}{x_1 - x_0}. \tag{2.2.2} \]

It is clear that $f[x_1,x_0] = f[x_0,x_1]$.
Thus the linear approximation may be expressed in the form

\[ f[x_0,x] \approx f[x_0,x_1], \tag{2.2.3} \]

which leads to the interpolation formula

\[ f(x) \approx f(x_0) + (x - x_0) f[x_0,x_1] \tag{2.2.4} \]

or

\[ f(x) \approx f(x_0) + \frac{x - x_0}{x_1 - x_0}\,[f(x_1) - f(x_0)] \tag{2.2.4'} \]

or, equivalently, to the formula

\[ f(x) \approx \frac{1}{x_1 - x_0}\,[(x_1 - x) f(x_0) - (x_0 - x) f(x_1)], \tag{2.2.4''} \]

which can also be expressed in the convenient determinantal form

\[ f(x) \approx \frac{1}{x_1 - x_0} \begin{vmatrix} f(x_0) & x_0 - x \\ f(x_1) & x_1 - x \end{vmatrix}. \tag{2.2.5} \]

It may be noticed that (2.2.4) involves one ordinate and a divided
difference, (2.2.4') one ordinate and an ordinary difference, and (2.2.4'')
involves the two ordinates directly. The last form (2.2.5) is particularly
well adapted to machine computation, since its evaluation involves the
continuous operation of the formation of a cross product followed by a
division.

It is convenient to designate the linear function defined by the right-hand
member of (2.2.4) by $y_{0,1}(x)$, the subscripts corresponding to the
ordinates used in its formation. For symmetry of notation, it is desirable
to write also

\[ y_0(x) = f[x_0] = f(x_0), \tag{2.2.6} \]

so that $f[x_0]$ is defined as the zeroth divided difference relative to $x_0$ and
is merely the value of $f(x)$ at $x = x_0$, and $y_0(x)$ is the interpolation polynomial
of degree zero which agrees with $f(x)$ at $x = x_0$. With this notation,
Eqs. (2.2.4) and (2.2.5) become

\[ f(x) \approx y_{0,1}(x) = f[x_0] + (x - x_0) f[x_0,x_1] \tag{2.2.7} \]

and

\[ y_{0,1}(x) = \frac{1}{x_1 - x_0} \begin{vmatrix} y_0(x) & x_0 - x \\ y_1(x) & x_1 - x \end{vmatrix}. \tag{2.2.8} \]

t Various other notations are used, such As (rom), Stoe), and (a,1)- 
These forms are given here principally to correspond to more general
forms to be obtained in following sections.

We see that the approximation $f(x) \approx y_{0,1}(x)$ is exact for all values of
$x$ if $f(x)$ is indeed a linear function, of the form $f(x) = A_0 + A_1 x$, and,
further, that the approximation is exact at the points $x = x_0$ and $x_1$, for
any function $f(x)$.

As a numerical example, the linear interpolation of $\sinh x$ for $x = 0.23$,
from tabulated five-place values for $x_0 = 0.20$ and $x_1 = 0.30$, may be
arranged as follows:

    x_i  |  f(x_i)  | x_i - x
    0.20 | 0.20134  |  -0.03
    0.30 | 0.30452  |   0.07

\[ f(0.23) \approx \frac{1}{0.10} \begin{vmatrix} 0.20134 & -0.03 \\ 0.30452 & 0.07 \end{vmatrix} = 0.23229. \]

Since the true five-place value is 0.23203, it is seen that linear interpolation
here affords only three-place accuracy.

It is useful to notice that, since a linear interpolation merely effects a
certain weighted average of the two ordinates involved, the result of an
interpolation involving two ordinates such as 13.6340 and 13.6393 can
be considered as the sum of 13.6300 and the result of effecting the same
interpolation on 40 and 93, with this result added to 13.6300 in units of
its last place.

Further, since the numerator and denominator of the ratio (2.2.5) are
homogeneous in the abscissas, the entries $x_i$ and $x_i - x$ in the computational
array may be multiplied by any convenient common factor. In
particular, the x's in the preceding table could be replaced by 20 and 30,
and the entries in the last column correspondingly by -3 and 7. This
will be done in subsequent examples involving linear interpolation.

Unless $f(x)$ is linear, the secant slope $f[x_0,x_1]$ will depend upon the
abscissas $x_0$ and $x_1$. However, if $f(x)$ were a second-degree polynomial,
the secant-slope function $f[x_1,x]$ would itself be a linear function of $x$, for
fixed $x_1$. That is, the ratio

\[ \frac{f[x_1,x_2] - f[x_0,x_1]}{x_2 - x_0} \]

would be independent of $x_0$, $x_1$, and $x_2$. This ratio is called the second
divided difference, relative to those three abscissas, and is designated here
by $f[x_0,x_1,x_2]$:

\[ f[x_0,x_1,x_2] = \frac{f[x_1,x_2] - f[x_0,x_1]}{x_2 - x_0}. \tag{2.2.9} \]

In particular, since $f[x_1,x_0,x] = f[x_0,x_1,x]$ (see §2.3), the difference between
the two members of (2.2.3) can be expressed as

\[ f[x_0,x] - f[x_0,x_1] = f[x_0,x] - f[x_1,x_0] = (x - x_1) f[x_0,x_1,x], \]

so that the approximation (2.2.4) can be replaced by the identity

\[ f(x) = f[x_0] + (x - x_0) f[x_0,x_1] + (x - x_0)(x - x_1) f[x_0,x_1,x]. \tag{2.2.10} \]

Thus the error committed in (2.2.7), by replacing $f(x)$ by $y_{0,1}(x)$, is
given by

\[ E(x) = f(x) - y_{0,1}(x) = (x - x_0)(x - x_1) f[x_0,x_1,x]. \tag{2.2.11} \]

Whereas knowledge of $f[x_0,x_1,x]$ is tantamount to knowledge of the exact
interpolant $f(x)$, the form (2.2.11) of the error is a special case of a more
general form to be obtained, which (as will be shown) is frequently useful
in obtaining an estimate of the error in an actual calculation. For any
linear function $f(x)$, the error term will indeed vanish identically, as may
be verified directly.
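
The cross-product form (2.2.5) and the error identity (2.2.11) can be illustrated by a short computation. The following Python sketch is an addition for illustration only; the function names are assumptions, and the unrounded values of `math.sinh` stand in for an exactly known $f(x)$ when the identity is checked.

```python
import math

def first_difference(x0, f0, x1, f1):
    """First divided difference f[x0, x1] as in (2.2.2)."""
    return (f1 - f0) / (x1 - x0)

def linear_interpolate(x, x0, f0, x1, f1):
    """Cross-product form (2.2.5): a 2-by-2 determinant followed by one division."""
    return (f0 * (x1 - x) - f1 * (x0 - x)) / (x1 - x0)

# Five-place entries for sinh x quoted in the text.
x0, f0 = 0.20, 0.20134
x1, f1 = 0.30, 0.30452
x = 0.23

y_cross = linear_interpolate(x, x0, f0, x1, f1)
y_newton = f0 + (x - x0) * first_difference(x0, f0, x1, f1)   # form (2.2.4)
print(round(y_cross, 5), round(y_newton, 5))

# Check of (2.2.11) with unrounded ordinates g(x) = sinh x.
g = math.sinh
d01 = first_difference(x0, g(x0), x1, g(x1))       # g[x0, x1]
d1x = first_difference(x1, g(x1), x, g(x))         # g[x1, x]
d01x = (d1x - d01) / (x - x0)                      # g[x0, x1, x] by (2.2.9)
y01 = g(x0) + (x - x0) * d01
error_from_formula = (x - x0) * (x - x1) * d01x
print(g(x) - y01, error_from_formula)
```

The two interpolated values agree because (2.2.4) and (2.2.5) are algebraically the same formula; the last line prints the actual error and the right-hand member of (2.2.11), which differ only by round-off.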

Before generalizing the result just obtained, it is desirable to define 
divided differences of all orders, and to investigate certain of their 
properties. 

2.3. Divided Differences. Divided differences of orders 0, 1, 2, ...,
k are defined iteratively by the relations

\[ f[x_0] = f(x_0), \qquad f[x_0,x_1] = \frac{f[x_1] - f[x_0]}{x_1 - x_0}, \qquad \ldots, \]
\[ f[x_0,\ldots,x_k] = \frac{f[x_1,\ldots,x_k] - f[x_0,\ldots,x_{k-1}]}{x_k - x_0}. \tag{2.3.1} \]

We notice that the first k - 1 arguments in the first term of the numerator
are the same as the last k - 1 arguments in the second term and
that the denominator is the difference between those arguments which
are not in common to the two terms. It is clear from the definition that
$f[x_0,\ldots,x_k]$ is a linear combination of the k + 1 ordinates $f(x_0), \ldots, f(x_k)$,
with the coefficients depending upon the corresponding k + 1
abscissas.

When k = 1, the divided difference obviously is a symmetric function
of its arguments, that is, $f[x_1,x_0] = f[x_0,x_1]$. It is shown next that the
same statement applies to divided differences of all orders. In order to
establish this fact directly in the case of k = 2, we may write

\[ f[x_0,x_1,x_2] = \frac{f[x_1,x_2] - f[x_0,x_1]}{x_2 - x_0}
   = \frac{1}{x_2 - x_0}\left[\frac{f(x_2) - f(x_1)}{x_2 - x_1} - \frac{f(x_1) - f(x_0)}{x_1 - x_0}\right] \]

and the result can be put in the symmetric form

\[ f[x_0,x_1,x_2] = \frac{f(x_0)}{(x_0 - x_1)(x_0 - x_2)} + \frac{f(x_1)}{(x_1 - x_0)(x_1 - x_2)} + \frac{f(x_2)}{(x_2 - x_0)(x_2 - x_1)}. \]
This result suggests the truth of the more general relation

\[ f[x_0,\ldots,x_k] = \frac{f(x_0)}{(x_0 - x_1)\cdots(x_0 - x_k)} + \frac{f(x_1)}{(x_1 - x_0)\cdots(x_1 - x_k)} + \cdots + \frac{f(x_k)}{(x_k - x_0)\cdots(x_k - x_{k-1})} \tag{2.3.2} \]

for any positive integer k, so that the coefficient of $f(x_i)$ is

\[ \alpha_i^{(k)} = \frac{1}{(x_i - x_0)\cdots(x_i - x_k)} \qquad (i = 0, 1, \ldots, k), \tag{2.3.3} \]

where the zero factor $(x_i - x_i)$ is to be omitted in the denominator.

In order to establish this conjecture by induction, suppose that it has
been proved for k = r. If we recall the definition

\[ f[x_0,\ldots,x_{r+1}] = \frac{1}{x_{r+1} - x_0}\,\{f[x_1,\ldots,x_{r+1}] - f[x_0,\ldots,x_r]\}, \tag{2.3.4} \]

it then follows that, for $i = 1, 2, \ldots, r$, the coefficient of $f(x_i)$ in the right-hand
member is given by

\[ \frac{1}{x_{r+1} - x_0}\left[\frac{1}{(x_i - x_1)\cdots(x_i - x_{r+1})} - \frac{1}{(x_i - x_0)\cdots(x_i - x_r)}\right]
   = \frac{1}{(x_i - x_0)\cdots(x_i - x_{r+1})} = \alpha_i^{(r+1)}, \tag{2.3.5} \]

in accordance with (2.3.3) with k = r + 1. When i = 0 or r + 1, only
one of the terms in the right-hand member of (2.3.4) involves the ordinate
$f(x_i)$, and the respective coefficient also is easily seen to be in accordance
with (2.3.3) with k = r + 1. Thus, if (2.3.2) is valid for k = r it is
valid also for k = r + 1. Since it has been established for k = 1 (and
k = 2), it is therefore valid for any positive integer k, as was to be shown.

It follows, from the symmetry of (2.3.2), that the order of the arguments
is irrelevant. Hence $f[x_0,\ldots,x_k]$ can be expressed as the difference
between two divided differences of order k - 1, having any k - 1
of their k arguments in common, divided by the difference between those
arguments which are not in common. For example, there follows

\[ f[x_0,x_1,x_2,x_3] = \frac{f[x_1,x_2,x_3] - f[x_0,x_1,x_2]}{x_3 - x_0}
   = \frac{f[x_0,x_2,x_3] - f[x_1,x_2,x_3]}{x_0 - x_1}. \]

In those cases when two or more arguments in a divided difference
become coincident, recourse must be had to appropriate limiting processes.
Thus, for example, if we set $x_1 = x + \epsilon$, there follows

\[ f[x_1,x] = f[x + \epsilon, x] = \frac{f(x + \epsilon) - f(x)}{\epsilon} \]

and, in the limit when $\epsilon \to 0$, we have

\[ f[x,x] = f'(x), \tag{2.3.6} \]

if $f(x)$ is differentiable. A similar argument shows that

\[ \frac{d}{dx} f[x_0,\ldots,x_k,x] = f[x_0,\ldots,x_k,x,x], \tag{2.3.7} \]

if $x_0, \ldots, x_k$ are constants. If $u_1, u_2, \ldots, u_n$ are differentiable functions
of $x$, there follows also

\[ \frac{d}{dx} f[x_0,\ldots,x_k,u_1,\ldots,u_n] = \sum_{\nu=1}^{n} f[x_0,\ldots,x_k,u_1,\ldots,u_n,u_\nu]\,\frac{du_\nu}{dx} \]

and hence, by taking $u_1 = \cdots = u_n = x$, we may deduce that

\[ \frac{d}{dx} f[x_0,\ldots,x_k,\underbrace{x,\ldots,x}_{n\ \text{times}}] = n\,f[x_0,\ldots,x_k,\underbrace{x,\ldots,x}_{n+1\ \text{times}}]. \tag{2.3.8} \]

Finally, by successive differentiation of (2.3.7) combined with the use of
(2.3.8) at each step, we may establish the additional useful formula

\[ \frac{d^r}{dx^r} f[x_0,\ldots,x_k,x] = r!\,f[x_0,\ldots,x_k,\underbrace{x,\ldots,x}_{r+1\ \text{times}}]. \tag{2.3.9} \]

In particular, we may deduce that the result of allowing r + 1 arguments
of a divided difference to become coincident is finite if the rth derivative of
$f(x)$ is finite at the point of confluence.
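
The limiting relations (2.3.6) and (2.3.9) can be observed numerically by letting the arguments lie close together rather than coincide. The sketch below is an illustrative addition; the choice of $f(x) = \sinh x$, the point $x = 0.5$, and the spacing `eps` are assumptions made for the example.

```python
import math

def divided_difference(f, points):
    """f[p_0, ..., p_k] from the recursive definition (2.3.1)."""
    if len(points) == 1:
        return f(points[0])
    upper = divided_difference(f, points[1:])
    lower = divided_difference(f, points[:-1])
    return (upper - lower) / (points[-1] - points[0])

x, eps = 0.5, 1e-3

# Two nearly coincident arguments: f[x, x + eps] approximates f'(x) = cosh x.
first = divided_difference(math.sinh, [x, x + eps])

# Three nearly coincident arguments: f[x, x + eps, x + 2 eps] approximates
# f''(x)/2!, which is (2.3.9) with no fixed arguments and r = 2.
second = divided_difference(math.sinh, [x, x + eps, x + 2 * eps])

print(first, math.cosh(x))
print(second, math.sinh(x) / 2)
```

The spacing must not be taken too small in floating-point work, since the numerators then suffer from cancellation; the value used here is a compromise chosen for the illustration.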
It is seen that $f[x_0,\ldots,x_k,x]$ is continuous at $x = \bar{x}$ if $\bar{x}$ is not identified
with $x_0, x_1, \ldots$, or $x_k$, and if $f(x)$ is continuous at $\bar{x}$. If $f'(x)$ does
not exist at $x_0$, the function $f[x_0,\ldots,x_k,x]$ generally will not tend to a
finite limit as $x \to x_0$. Thus, for example, if $f(x) = \sqrt{x}$, there follows
$f[0,x] = 1/\sqrt{x}$, and this function naturally becomes infinite as $x \to 0$.
However, the product $(x - x_0) f[x_0,\ldots,x_k,x]$ is identical with

\[ f[x,x_1,\ldots,x_k] - f[x_0,x_1,\ldots,x_k] \]

and, since the function $f[x,x_1,\ldots,x_k]$ is continuous at $x_0$ if $f(x)$ is continuous
there, it follows that the product will tend to zero at $x_0$ if $f(x)$ is
continuous at $x_0$. Thus we may deduce that the product $(x - x_0)\cdots(x - x_k) f[x_0,\ldots,x_k,x]$
tends to zero as $x$ approaches any one of the k + 1
distinct points $x_0, \ldots, x_k$ if $f(x)$ is continuous at that point.

It may be expected that the kth divided difference of a polynomial of
degree n is a polynomial of degree n - k if $k \le n$, and is identically zero
if $k > n$. The proof follows easily from the fact that the first divided
difference of $x^m$,

\[ \frac{x^m - x_0^m}{x - x_0} = x^{m-1} + x_0 x^{m-2} + \cdots + x_0^{m-1}, \]

is a polynomial of degree m - 1 in $x$, when m is a positive integer.
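
The statement about polynomials can be seen in a difference table. In the sketch below (an illustrative addition; the cubic and the unequally spaced abscissas are arbitrary choices, and `difference_table` is a hypothetical helper), the third differences of a cubic all equal its leading coefficient and the fourth difference vanishes.

```python
from fractions import Fraction

def difference_table(xs, fs):
    """Return columns of divided differences; column k holds f[x_i, ..., x_{i+k}]."""
    columns = [list(fs)]
    for k in range(1, len(xs)):
        previous = columns[-1]
        columns.append([
            (previous[i + 1] - previous[i]) / (xs[i + k] - xs[i])
            for i in range(len(previous) - 1)
        ])
    return columns

def p(x):
    """An arbitrary cubic with leading coefficient 2."""
    return 2 * x ** 3 - x + 5

xs = [Fraction(v) for v in (-2, 0, 1, 3, 7)]
table = difference_table(xs, [p(x) for x in xs])

for k, column in enumerate(table):
    print(k, [str(entry) for entry in column])
```

The second-difference column is not constant, since the second divided difference of a cubic is still a first-degree function of its arguments.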
2.4. Second-degree Interpolation. If the accuracy afforded by a linear
interpolation is inadequate, a generally more accurate procedure is based
upon the supposition that $f(x)$ may be approximated by a polynomial of
second degree near the abscissa of the interpolate. This is equivalent to
assuming that, within a certain interval and within a prescribed tolerance,
the first divided difference $f[x,x_0]$ is a linear function of $x$ for fixed $x_0$
or, equivalently, that the second divided difference $f[x,x_0,x_1]$ is constant.
The approximation

\[ f[x,x_0,x_1] \approx f[x_2,x_0,x_1] = f[x_0,x_1,x_2] \tag{2.4.1} \]

then takes the form

\[ \frac{f[x_0,x] - f[x_0,x_1]}{x - x_1} \approx f[x_0,x_1,x_2] \]

or, after another reduction,

\[ f(x) \approx y_{0,1,2}(x) = f[x_0] + (x - x_0) f[x_0,x_1] + (x - x_0)(x - x_1) f[x_0,x_1,x_2]. \tag{2.4.2} \]

Since the difference between the two members of (2.4.1) is

\[ (x - x_2) f[x_0,x_1,x_2,x], \]

the error in the approximation (2.4.2) is given by

\[ E(x) = (x - x_0)(x - x_1)(x - x_2) f[x_0,x_1,x_2,x]. \tag{2.4.3} \]

From this result, we may deduce that $E(x) = 0$ if $f(x)$ is a polynomial
of degree two or less, and that $E = 0$ when $x = x_0$, $x_1$, and $x_2$, for any
function $f(x)$. Thus $y_{0,1,2}(x)$ is a polynomial of degree two which
agrees with $f(x)$ when $x = x_0$, $x_1$, and $x_2$.

In order to make use of (2.4.2), one may first form a difference table as
follows:

    a_0 | f(a_0)
                   f[a_0,a_1]
    a_1 | f(a_1)                 f[a_0,a_1,a_2]
                   f[a_1,a_2]
    a_2 | f(a_2)

Each entry is given by the difference between diagonally adjacent
entries to its left, divided by the difference between the abscissas corresponding
to the ordinates intercepted by the diagonals passing through the calculated
entry.

Thus, for $f(x) = \sinh x$, the following table may be formed, in illustration,
with the abscissas 0.0, 0.2, and 0.3:

    a_0 = 0.00 | 0.00000
                            1.0067
    a_1 = 0.20 | 0.20134              0.08367
                            1.0318
    a_2 = 0.30 | 0.30452

Suppose that only the given data are available and that the value of
$f(0.23)$ is to be interpolated. If we take $x_i = a_i$, the calculation from
(2.4.2) is of the form

\[ f(0.23) \approx 0.00000 + (0.23)(1.0067) + (0.23)(0.03)(0.08367) = 0.00000 + 0.231541 + 0.000577 = 0.23212, \]

with an associated error of -0.00009. (One extra place was carried
through the intermediate calculation, with the final result rounded to
five places.)

By renumbering the x's, the calculation can be rearranged in various
ways. For example, since the argument of the interpolant is nearest $a_1$,
it may be suggested that we take $x_0 = a_1$ and, say, $x_1 = a_2$ and $x_2 = a_0$.
In this case, there follows

\[ f(0.23) \approx 0.20134 + (0.03)(1.0318) + (0.03)(-0.07)(0.08367) = 0.20134 + 0.030954 - 0.000176 = 0.23212, \]

with the same end result.

The first calculation uses differences on the indicated forward diagonal
starting from $f(a_0)$, the second uses differences on the indicated zigzag
path starting from $f(a_1)$. By further renumbering, other paths also
terminating with $f[a_0,a_1,a_2]$ could be selected, all of which would give exactly
the same value of the interpolant if no intermediate round-off errors were
present.

The second path is the one which departs least from an imaginary
horizontal line through the argument of the interpolant. Accordingly,
the new information introduced at each stage of the calculation is that
which may be expected to be most relevant to that interpolant, so that
the rate of approach to the final value may be expected to be maximized
at each step of the path. In addition, since the coefficients by which the
successive divided differences are multiplied are smaller in magnitude
along the preferred path, the effects of round-offs introduced in the
calculation of those divided differences will be somewhat reduced.†


† In this connection, it should be mentioned that, if divided differences of rounded
values (not divided rounded differences of true values) are used, if the results of the
If the value of $f(0.10)$ were required, from the given data alone, the
first path would be the preferred one from the preceding point of view and
would lead to the calculation

\[ f(0.10) \approx 0.00000 + (0.10)(1.0067) + (0.10)(-0.10)(0.08367) = 0.00000 + 0.10067 - 0.000837 = 0.09983, \]

whereas the true five-place value is 0.10017. Finally, to interpolate for
$f(0.27)$, a path along the backward diagonal starting with $f(a_2)$ is preferable.
Hence we would set $x_0 = a_2$, $x_1 = a_1$, and $x_2 = a_0$, and would
obtain

\[ f(0.27) \approx 0.30452 + (-0.03)(1.0318) + (-0.03)(0.07)(0.08367) = 0.30452 - 0.030954 - 0.000176 = 0.27339, \]

as compared with the true five-place value 0.27329.
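
The three interpolations above can be reproduced by a short routine that forms the divided differences for the abscissas in whatever order they are supplied and then evaluates the form (2.4.2). The sketch below is an illustrative addition; the name `newton_value` is an assumption, and no intermediate rounding is applied, so the printed values agree with those of the text only after rounding to five places.

```python
def newton_value(x, nodes, values):
    """Evaluate the interpolating polynomial through (nodes[i], values[i]),
    using divided differences taken in the order in which the nodes are given."""
    n = len(nodes)
    coeffs = list(values)
    # After pass k, coeffs[i] holds f[x_{i-k}, ..., x_i] for i >= k.
    for k in range(1, n):
        for i in range(n - 1, k - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (nodes[i] - nodes[i - k])
    # Nested evaluation of f[x0] + (x - x0) f[x0,x1] + (x - x0)(x - x1) f[x0,x1,x2].
    result = coeffs[-1]
    for i in range(n - 2, -1, -1):
        result = coeffs[i] + (x - nodes[i]) * result
    return result

a = [0.00, 0.20, 0.30]
fa = [0.00000, 0.20134, 0.30452]          # five-place values of sinh x

forward = newton_value(0.23, a, fa)                                          # x0, x1, x2 = a0, a1, a2
zigzag = newton_value(0.23, [a[1], a[2], a[0]], [fa[1], fa[2], fa[0]])       # x0, x1, x2 = a1, a2, a0
at_010 = newton_value(0.10, a, fa)
at_027 = newton_value(0.27, a[::-1], fa[::-1])                               # backward diagonal

print(round(forward, 5), round(zigzag, 5), round(at_010, 5), round(at_027, 5))
```

Since all orderings incorporate the same three ordinates, the forward and zigzag results differ only at the level of floating-point round-off.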

In the preceding calculations, and in similar ones, when the number
of differences to be retained has been decided in advance, and when the
end point of the path is also predetermined, the reduction in loss of
accuracy afforded by the "preferred path" is usually of no great consequence
and the rate of approach to the final value at intermediate stages
is irrelevant to the final result. Thus, the choice of paths is then relatively
unimportant. However, in the more involved cases when differences
of higher order are available, and when the point at which the path
is to be terminated is not preassigned, it is desirable to choose that path
which, when terminated after any number of steps, may be expected to
afford the best result obtainable with that number of steps. The preceding
examples were intended to illustrate such paths in simple cases.

2.5. Newton's Fundamental Formula. The identities (2.2.10) and
(2.4.2) are special cases of a general formula, due to Newton, which may
be derived as follows.

From the basic definition (2.3.1), there follows

\[ f(x) = f[x_0] + (x - x_0) f[x_0,x], \]
\[ f[x_0,x] = f[x_0,x_1] + (x - x_1) f[x_0,x_1,x], \]
\[ \cdots \]
\[ f[x_0,\ldots,x_{n-1},x] = f[x_0,\ldots,x_n] + (x - x_n) f[x_0,\ldots,x_n,x]. \tag{2.5.1} \]

By substituting the second relation in the first, one obtains (2.2.10),

\[ f(x) = f[x_0] + (x - x_0) f[x_0,x_1] + (x - x_0)(x - x_1) f[x_0,x_1,x], \]



divisions do not require additional round-offs, and if all following calculations are
effected without round-off, all paths which incorporate the same given data will lead
to exactly the same end results. Thus the preferred path does not minimize the effects
of inherent errors in the given data (as is sometimes argued). Those effects depend
only upon the end point of the path and are considered in §3.2.


and, by successively substituting from subsequent relations in (2.5.1),
there follows finally

$$\begin{aligned}
f(x) = {} & f[x_0] + (x - x_0) f[x_0, x_1] + (x - x_0)(x - x_1) f[x_0, x_1, x_2] + \cdots \\
          & + (x - x_0) \cdots (x - x_{n-1}) f[x_0, \ldots, x_n] + E(x),
\end{aligned} \tag{2.5.2}$$

where

$$E(x) = (x - x_0) \cdots (x - x_n) f[x_0, \ldots, x_n, x]. \tag{2.5.3}$$


The obvious details of the induction are omitted. 

The approximate relation obtained by suppressing the error term in
(2.5.2) is known as Newton's interpolation formula with divided differences.
The resultant right-hand member, which is clearly a polynomial of degree
n, may be denoted by $y_{0,\ldots,n}(x)$. An inspection of the error term then
shows that $y_{0,\ldots,n}(x)$ is identical with f(x) if f(x) is a polynomial of
degree n or less, and that it agrees with f(x) at the n + 1 points
$x = x_0, \ldots, x_n$, regardless of the form of f. Further, there exists no
other polynomial Y(x) of degree n or less having this property, since, if this
were the case, Y - y would be a polynomial of maximum degree n with n + 1 zeros.
This situation is impossible unless Y - y vanishes identically.

Thus, if f(x) is known at n + 1 distinct points $a_0, a_1, \ldots, a_n$, where
$a_0 < a_1 < \cdots < a_n$, a variety of equivalent forms of the interpolation
polynomial $y_{0,\ldots,n}(x)$ of degree n (or less) which agrees with f(x) at
these points can be obtained by identifying each of the x's in (2.5.2) with one
of the a's. The various possible forms are not considered here in explicit
detail. However, in Chap. 4 a more detailed consideration is given to
the situation in which the abscissas $a_0, \ldots, a_n$ are equally spaced, so
that certain simplifications are possible, and convenient use can be made
of available tables of certain coefficient functions.

In illustration, we suppose that values of sinh x are given to five places
for x = 0.0, 0.20, 0.30, and 0.50, and that sinh 0.23 is required by use of
third-order interpolation. The calculation may be arranged as follows:

    x_2 = 0.00 | 0.00000
                           1.0067
    x_0 = 0.20 | 0.20134             0.08367
                           1.0318               0.17333
    x_1 = 0.30 | 0.30452             0.17033
                           1.0829
    x_3 = 0.50 | 0.52110

$$\begin{aligned}
f(0.23) &\approx 0.20134 + (0.03)(1.0318) + (0.03)(-0.07)(0.08367) \\
        &\qquad + (0.03)(-0.07)(0.23)(0.17333) \\
        &= 0.20134 + 0.030954 - 0.000176 - 0.000084 \approx 0.23203.
\end{aligned}$$

The initial point $x_0$ was taken to be as near as possible to the argument
of the interpolant, and the remaining abscissas were numbered in accordance
with the indicated zigzag path of differences. The same end result,
which is correct to the five places given, could also have been obtained by
any one of a number of other orderings of the abscissas.
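The calculation above can be checked with a short Python sketch. It is an
illustration under stated assumptions, not the author's procedure: the function
names `divided_differences` and `newton_eval` are hypothetical, and the
abscissas are entered in the zigzag order $x_0, x_1, x_2, x_3$ of the text.

```python
def divided_differences(xs, ys):
    """Return f[x0], f[x0,x1], ..., f[x0,...,xn] for the given ordering."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Update from the bottom so lower-order entries are still available.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs


def newton_eval(xs, coeffs, x):
    """Evaluate the Newton form (2.5.2), without the error term, by nesting."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result


# Abscissas in the order x0, x1, x2, x3 used in the text.
xs = [0.20, 0.30, 0.00, 0.50]
ys = [0.20134, 0.30452, 0.00000, 0.52110]

coeffs = divided_differences(xs, ys)   # 0.20134, 1.0318, 0.08367, 0.17333 (rounded)
value = newton_eval(xs, coeffs, 0.23)

# Any other ordering of the same data gives the same end result.
order = sorted(range(len(xs)), key=lambda i: xs[i])
xs_sorted = [xs[i] for i in order]
ys_sorted = [ys[i] for i in order]
value_sorted = newton_eval(xs_sorted, divided_differences(xs_sorted, ys_sorted), 0.23)
print(round(value, 5), round(value_sorted, 5))
```

The coefficients produced for this ordering are the entries met along the
zigzag path in the table, and the reordered evaluation illustrates the remark
that the end result does not depend on the ordering.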

Once an appropriate continuous path of differences (made up of diagonal
segments, each sloping upward or downward to the right) has been
selected, reference to (2.5.2) shows that the coefficient of the kth difference
encountered is the product of k factors, each of which represents the difference
between the abscissa of the interpolant and the abscissa of an ordinate used in
the formation of a difference previously encountered.†

It is convenient to speak of the data lying inside and on the boundary
of the triangular region, limited by the column of ordinates (zeroth
differences) in a difference table and the two diagonals passing through a
specific difference in that table, as comprising the region of determination
for that difference. It is then easily seen that the ordinates involved in the
formation of any difference are exactly those ordinates which lie in its region
of determination. Further, for a difference path of the sort considered
here, the region of determination of the kth difference encountered
includes the regions relevant to all differences previously encountered.

These facts permit us to write down, by inspection, the coefficient of
any difference encountered in a chosen path. For example, in order to
obtain the coefficient of 0.08367 in the preceding calculation, we notice
that the region of determination for the preceding difference in the path
(1.0318) includes the ordinates corresponding to the abscissas 0.20 and
0.30. Hence the desired coefficient is

$$(0.23 - 0.20)(0.23 - 0.30) = -0.0021.$$


2.6. Error Formulas. It was shown in the preceding section that, if
f(x) is approximated by a polynomial $y(x) \equiv y_{0,\ldots,n}(x)$ of maximum
degree n, which coincides with it at the n + 1 distinct points $x_0, \ldots, x_n$,
then the error E(x) = f(x) - y(x) is given by

$$E(x) = \pi(x) f[x_0, \ldots, x_n, x], \tag{2.6.1}$$

where $\pi(x)$ is the polynomial of degree n + 1 defined as the product

$$\pi(x) = (x - x_0)(x - x_1) \cdots (x - x_n). \tag{2.6.2}$$

This form of the error term will be particularly useful in considering the
accuracy of formulas for numerical differentiation and integration in
subsequent chapters.


† The instructions comprised in this statement are frequently referred to as
Sheppard's rules. Actually, the path of differences need not be continuous, but
discontinuous paths are inconvenient and can be avoided, in any case, by
reordering the array of given data.

However, if f(x) possesses n + 1 continuous derivatives in the relevant
interval, there exists another form of the remainder which is often more
useful in certain other considerations. In order to obtain it, we notice
first that both f(x) - y(x) and $\pi(x)$ vanish at the n + 1 points
$x = x_0, x_1, \ldots, x_n$. We consider a linear combination of these functions,

$$F(x) = f(x) - y(x) - K\pi(x), \tag{2.6.3}$$

and determine the constant K in such a way that F(x) vanishes, not only
at those n + 1 points, but also at an arbitrarily chosen point $\bar{x}$ which
differs from all of them. Since $\pi(x)$ vanishes only at the n + 1
points considered previously, K can be so determined. Let I denote the closed
interval limited by the smallest and largest of the numbers
$x_0, x_1, \ldots, x_n, \bar{x}$. Then F(x) vanishes at least n + 2
times in the closed interval I. By Rolle's theorem (§1.7), F'(x) vanishes
at least n + 1 times inside I, F''(x) at least n times, and so forth, and,
finally, $F^{(n+1)}(x)$ vanishes at least once inside I. Let one such point be
denoted by $\bar{\xi}$. There then follows, from (2.6.3),

$$0 = f^{(n+1)}(\bar{\xi}) - y^{(n+1)}(\bar{\xi}) - K\pi^{(n+1)}(\bar{\xi}). \tag{2.6.4}$$

But since y(x) is a polynomial of maximum degree n, its (n + 1)th
derivative vanishes identically. Also, from the definition (2.6.2), there
follows $\pi^{(n+1)}(x) = (n + 1)!$. Hence (2.6.4) yields the determination

$$K = \frac{1}{(n + 1)!} f^{(n+1)}(\bar{\xi}),$$

and the relation $F(\bar{x}) = 0$ becomes

$$f(\bar{x}) - y(\bar{x}) = \frac{1}{(n + 1)!} f^{(n+1)}(\bar{\xi})\, \pi(\bar{x}),$$

for some $\bar{\xi}$ in I. If $\bar{x}$ is identified with any one of
$x_0, \ldots, x_n$, both sides of this relation vanish, so that it is valid
even in the previously excluded case. Since $\bar{x}$ is thus arbitrary, the
bars may be suppressed, and there follows finally

$$E(x) = \frac{1}{(n + 1)!} f^{(n+1)}(\xi)\, \pi(x) \tag{2.6.5}$$

for some $\xi$ in the interval I, where I is the interval limited by the largest
and smallest of the numbers $x_0, x_1, \ldots, x_n, x$. This result guarantees
merely that, for any given x, there exists at least one corresponding number
$\xi$ in I such that the error is expressible in the form (2.6.5).

Since (2.6.1) and (2.6.5) must be equivalent, we thus obtain also the
useful result



$$f[x_0, \ldots, x_n, x] = \frac{1}{(n + 1)!} f^{(n+1)}(\xi) \tag{2.6.6}$$

for some argument $\xi$ in the interval I. This fact will be needed in later
developments.

In order to illustrate the application of the error formula (2.6.5), we
consider the second-order interpolation (n = 2) for f(0.23) effected in §2.4.
Under the assumption that the analytic expression for the interpolated
function is known to be f(x) = sinh x, there follows also f'''(x) = cosh x.
Thus the error committed is given by

$$\begin{aligned}
E(0.23) &= \tfrac{1}{6}(0.23 - 0.00)(0.23 - 0.20)(0.23 - 0.30) \cosh \xi \\
        &= -0.0000805 \cosh \xi,
\end{aligned}$$

for some $\xi$ such that $0 < \xi < 0.30$. It happens in this case that cosh x
may be computed at the tabular points from the given data, by use of the
formula $\cosh x = (1 + \sinh^2 x)^{1/2}$, and the range in cosh x over the given
interval is thus found (without the need for additional data, but with use
of the fact that cosh x increases throughout the interval) to be between 1
and 1.04534. Thus there follows

$$-0.0000842 < E(0.23) < -0.0000805,$$

so that the error in the last place retained in the calculation should be
-8. Actually, the error was found to be -9 in the fifth place. The
discrepancy is due, not to round-offs in calculation (which were
sufficiently controlled by retention of a sixth digit, as may be verified), but
to the fact that each of the original data possesses a round-off error which
may be as large as $5 \times 10^{-6}$.
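The bounds just obtained can be reproduced with a few lines of Python. This is
only a sketch of the arithmetic above: the variable names are hypothetical, and
the only data used are the three abscissas and the tabulated value
sinh 0.30 = 0.30452.

```python
import math

nodes = [0.00, 0.20, 0.30]   # abscissas used in the second-order interpolation
x = 0.23

pi_x = math.prod(x - a for a in nodes)        # pi(x) of (2.6.2)
factor = pi_x / math.factorial(len(nodes))    # pi(x) / (n + 1)! with n = 2

# cosh x increases on [0, 0.30]; its end values follow from the sinh table.
cosh_lo = 1.0
cosh_hi = math.sqrt(1.0 + 0.30452 ** 2)

bounds = sorted([factor * cosh_lo, factor * cosh_hi])
print(cosh_hi, bounds)
```

Since the factor multiplying cosh ξ is negative, the larger value of cosh ξ
gives the lower bound for E(0.23).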

In other applications of interpolation, the analytic expression for f(x)
may not be known, and hence it may be impossible to determine the
range of possible values of $f^{(n+1)}(\xi)$ in order to estimate the error E. In
such cases, the relation (2.6.1) may be more useful. For, if sufficient
data are available to permit the evaluation of one or more sample values
of the (n + 1)th divided difference, these values may be taken as estimates
of the value of the divided difference which is actually relevant to (2.6.1).
Thus, from the data obtained in §2.5, the divided difference

$$f[0.00, 0.20, 0.30, 0.50] = 0.17333$$

may serve as an estimate of the required value f[0.00, 0.20, 0.30, 0.23],
leading to the error estimate

$$E(0.23) \approx 0.17\,\pi(0.23) = (-0.00048)(0.17) = -0.00008.$$


The fact that this estimate is indeed good in this case is a consequence
of the fact that the third derivative, and hence also the third divided
difference, does not vary greatly in the range considered. It may be
noticed that this error estimate is precisely the correction term which was
involved in the calculation of §2.5 as a result of incorporating the
contribution of the third difference. More generally, a consideration of
(2.5.2) and (2.5.3) shows that if an interpolation for $f(\bar{x})$ is made,
terminating with an nth difference, the error committed is given exactly by
the product of the calculable number $\pi(\bar{x})$ and the (n + 1)th difference
$f[x_0, \ldots, x_n, \bar{x}]$, which is not calculable unless $f(\bar{x})$ is
known. On the other hand, the first term omitted in a calculation based on
(2.5.2) is the product of $\pi(\bar{x})$ and the (n + 1)th difference
$f[x_0, \ldots, x_n, x_{n+1}]$. If $f[x_0, \ldots, x_n, x]$ does not vary
markedly over an interval including $x = \bar{x}$ and $x = x_{n+1}$, this first
term omitted will indeed supply a good estimate of the error. This situation
will exist, in particular, in consequence of (2.6.5), if $f^{(n+1)}(x)$ does
not vary markedly over an interval I including
$x = x_0, \ldots, x_{n+1}, \bar{x}$.
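The estimate of the error by the first term omitted can be sketched as
follows. The fragment is illustrative only: `divided_difference` is a
hypothetical helper which applies the recursive definition directly, and the
data are the four five-place values of sinh x used in §2.5.

```python
import math


def divided_difference(xs, ys):
    """f[x0,...,xn] from f[x1,...,xn] and f[x0,...,x(n-1)] (recursive definition)."""
    if len(xs) == 1:
        return ys[0]
    upper = divided_difference(xs[1:], ys[1:])
    lower = divided_difference(xs[:-1], ys[:-1])
    return (upper - lower) / (xs[-1] - xs[0])


xs = [0.00, 0.20, 0.30, 0.50]
ys = [0.00000, 0.20134, 0.30452, 0.52110]
x_bar = 0.23

# pi(x_bar) for the three abscissas used in the second-order interpolation.
pi_x = math.prod(x_bar - a for a in xs[:3])

third = divided_difference(xs, ys)   # sample value f[0.00, 0.20, 0.30, 0.50]
estimate = pi_x * third              # first term omitted, used as error estimate
print(third, estimate)
```

The sample third difference stands in for the unknown value
f[0.00, 0.20, 0.30, 0.23]; the estimate is useful only to the extent that the
third difference varies slowly, as noted above.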

It may be noticed that, as n increases without limit, the length of the
interval I, as well as that of the interval limited by $\bar{x}$ and $x_{n+1}$,
generally will also increase without limit, since the later abscissas
introduced are generally more remote from $\bar{x}$, so that the uncertainty of
this particular error estimate may be expected to increase. In fact, in many
cases the result of omitting the error term in (2.5.2), and allowing n to become
infinite, leads to an infinite interpolation series which is itself not convergent.†
That is, the error E(x) associated with retention of differences of order not
greater than n very often does not tend to zero as n increases without
limit. However, if the abscissas $x_0, x_1, x_2, \ldots$ are appropriately
ordered, it is usually true that the magnitude of the error E first decreases
fairly rapidly with increasing n, but then increases in magnitude as n
continues to increase. In most practical cases, the minimal error is
extremely small, and the minimal stage occurs for a value of n so large
that it is not actually encountered.

In view of this situation, the error E(x) is not generally one which can
be reduced in magnitude within an arbitrarily prescribed tolerance by
increasing the number of differences retained. Thus, although this error
is commonly known as the "truncation error," it should be noticed again
that this terminology often is somewhat misleading in that it would seem
to imply an error committed by truncating a convergent infinite sequence
of calculations after a finite number of steps.

As in §1.3, we continue to define a truncation error as any error which
would be present even in the ideal case when the given data are exact and
infinitely many decimal places are retained in the calculations, and we
shall refer to E(x) as a truncation error in this general sense. The
superimposed effects of round-off errors may be of equal or greater
importance, and the optimum situation is frequently that one in which the
errors due to truncation and to round-off are of the same order of magnitude.

† The series obviously terminates if f(x) is a polynomial, and is a convergent infinite
series in certain other cases. Some information with regard to this question is given
in §4.11.

2.7. Iterated Interpolation. In §2.2, it was shown that linear
interpolation can be conveniently effected by use of the formula

$$y_{0,1}(x) = \frac{1}{x_1 - x_0}
\begin{vmatrix} y_0(x) & x_0 - x \\ y_1(x) & x_1 - x \end{vmatrix}, \tag{2.7.1}$$

where $y_0(x)$ and $y_1(x)$ are two independent interpolation polynomials of
degree zero,

$$y_0(x) = f(x_0), \qquad y_1(x) = f(x_1). \tag{2.7.2}$$

In the same way, quadratic interpolation can be effected by linear
interpolation over two independent linear interpolation polynomials, so
that

$$y_{0,1,2}(x) = \frac{1}{x_2 - x_1}
\begin{vmatrix} y_{0,1}(x) & x_1 - x \\ y_{0,2}(x) & x_2 - x \end{vmatrix}. \tag{2.7.3}$$

In order to verify this fact directly, we may notice, for example, that the
right-hand member of (2.7.3) is a polynomial of second degree, that it
takes on the values $f(x_0)$ and $f(x_2)$ when $x = x_0$ and $x = x_2$,
respectively, and that when $x = x_1$ it correctly takes on the value

$$\frac{1}{x_2 - x_1}
\begin{vmatrix} f(x_1) & 0 \\ y_{0,2}(x_1) & x_2 - x_1 \end{vmatrix} = f(x_1).$$

In a similar way, we may effect cubic interpolation by linear
interpolation over two independent quadratic interpolation polynomials, and
so forth. This procedure is particularly useful for machine calculation, and
it has the advantage that it yields a sequence of interpolates, from which
the rate of effective convergence† can be estimated.

In Aitken's method, the first four stages of the calculation would be
arranged as follows:

    x_0 | y_0                                                | x_0 - x
    x_1 | y_1   y_{0,1}                                      | x_1 - x
    x_2 | y_2   y_{0,2}   y_{0,1,2}                          | x_2 - x
    x_3 | y_3   y_{0,3}   y_{0,1,3}   y_{0,1,2,3}            | x_3 - x
    x_4 | y_4   y_{0,4}   y_{0,1,4}   y_{0,1,2,4}   y_{0,1,2,3,4} | x_4 - x

Here, for example, the entry $y_{0,1,3}$ would be obtained by evaluating the
determinant

$$\begin{vmatrix} y_{0,1} & x_1 - x \\ y_{0,3} & x_3 - x \end{vmatrix},$$


the elements of which can be conveniently located in the above array, and
dividing the result by $x_3 - x_1$. Here the calculation is aided by the fact
that this divisor can be obtained as the difference $(x_3 - x) - (x_1 - x)$
between the entries in the right-hand column. The abscissas labeled as
$x_0, \ldots, x_n$ may be arranged in any order, and the final entry
$y_{0,1,\ldots,n}$ is independent of that arrangement (except for the effects
of intermediate round-offs). However, it is often desirable to
designate the abscissa nearest the argument of the interpolant by $x_0$, the
next nearest by $x_1$, and so forth, for then the entries
$y_0, y_{0,1}, y_{0,1,2}, \ldots$ on the diagonal represent the best possible
estimates, from the given set of data, which can be afforded by polynomial
interpolation of order zero, one, two, and so forth. Also, each such estimate
incorporates the information used in the preceding estimate, together
with one additional datum. Thus the rate of effective convergence can
be fairly reliably estimated by considering the sequence of entries in
the diagonal of the array.

† The phrase effective convergence will be used in accordance with the generally …

In the case of the interpolation considered in §2.5, the work could be
arranged as follows, through the third-order calculation:


    0.20 | 0.20134                                      |  -3
    0.30 | 0.30452   0.232294                           |   7
    0.00 | 0.00000   0.231541   0.232118                | -23
    0.50 | 0.52110   0.233316   0.231936   0.232034     |  27

(The entries in the right-hand column are the differences between the
abscissas and 0.23, in units of 0.01.) The value 0.2320 would be assumed,
whereas the fifth place would be considered doubtful. In order to decrease
the uncertainty, additional data would be needed. If, for example, f(0.60)
were also available, an additional row of entries would then be calculated,
as follows:

    0.60 | 0.63665   0.233988   0.231899   0.232034   0.232034 |  37

Thus the value 0.23203 appears to be stabilized as the five-digit
interpolate corresponding to the given data.
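The arrangement of Aitken's method lends itself to a short program. The
following Python sketch is an assumption-laden illustration rather than a
prescribed implementation: the function name `aitken_table` is hypothetical,
and the data are the four values used in the third-order calculation above.
Each new entry is the 2 x 2 determinant divided by the difference of the two
abscissas involved.

```python
def aitken_table(xs, ys, x):
    """Return the triangular array; rows[i][k] is y_{0,...,k-1,i}(x)."""
    n = len(xs)
    rows = [[y] for y in ys]
    for k in range(1, n):                  # k = order of the new column
        pivot = rows[k - 1][k - 1]         # diagonal entry y_{0,...,k-1}(x)
        for i in range(k, n):
            entry = rows[i][k - 1]         # y_{0,...,k-2,i}(x)
            # 2 x 2 determinant divided by x_i - x_{k-1}
            value = (pivot * (xs[i] - x) - entry * (xs[k - 1] - x)) / (xs[i] - xs[k - 1])
            rows[i].append(value)
    return rows


# Abscissas ordered by nearness to the argument 0.23, as in the text.
xs = [0.20, 0.30, 0.00, 0.50]
ys = [0.20134, 0.30452, 0.00000, 0.52110]

rows = aitken_table(xs, ys, 0.23)
diagonal = [rows[i][i] for i in range(len(xs))]
for row in rows:
    print([round(v, 6) for v in row])
```

The diagonal entries are the successive interpolates of order zero, one, two,
and three. Because the program carries full floating-point precision instead of
rounding each entry to six places, its entries may differ from the tabulated
ones by a unit or so in the sixth place.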
2.8. Inverse Interpolation. It frequently happens that a function
y = f(x) is given (numerically or analytically) as a single-valued
function of x, and that the value of the independent variable x is required
for which the dependent variable y takes on a prescribed value (frequently
zero). This is the problem of inverse interpolation.

If y = f(x), then over any x interval in which f'(x) exists and does not
vanish, a unique inverse function, say x = F(y), exists. Thus, if dy/dx does
not vanish near the point where the inverse interpolation is to be effected
(so that y steadily increases or decreases in the neighborhood of that
point), it may be that F(y) can be satisfactorily approximated in that
neighborhood by a polynomial of moderately low degree, so that the inverse
interpolation may be effected by merely tabulating x as a function of y in
that neighborhood, and using the preceding methods (or any other appropriate
methods) of direct interpolation.

In illustration, suppose that the following data are available and that
the zero of y(x) between x = 1.3 and x = 1.4 is required.

    x |  1.1     1.2     1.3     1.4      1.5
    y |  0.769   0.472   0.103   -0.344   -0.875

If Aitken's method is used, with the entries ordered with respect to the
nearness of an ordinate to zero, the calculations may be arranged as
follows, the ordinates being entered in units of 0.001:

    10^3 y |  x                                          | 10^3 (y - 0)
       103 | 1.3                                         |  103
      -344 | 1.4   1.32304                               | -344
       472 | 1.2   1.32791   1.32509                     |  472
       769 | 1.1   1.33093   1.32548   1.32447           |  769
      -875 | 1.5   1.32106   1.32432   1.32482   1.32463 | -875


Thus, a fourth-order interpolation yields x ≈ 1.3246, with its last place
in doubt, although the uncertainty corresponding to the presence of
round-off in the given data would also remain. Actually, the given data
are exact values corresponding to the algebraic relation $y = -x^3 + x + 1$,
and the problem can be considered as that of determining the real zero
of the equation $x^3 - x - 1 = 0$, the true value of which is 1.32472, to
five places.
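The inverse interpolation above amounts to applying Aitken's method with the
roles of x and y interchanged. A minimal Python sketch follows; the function
name `aitken_value` is hypothetical, and the data are those of the table, in
the same order. Since the program does not round intermediate entries to five
places, its result may differ from the tabulated 1.32463 by a unit in the last
place.

```python
def aitken_value(ts, vs, t):
    """Final Aitken entry: value at t of the polynomial through (ts[i], vs[i])."""
    col = list(vs)
    n = len(ts)
    for k in range(1, n):
        for i in range(k, n):
            # col[k-1] holds the interpolate on nodes 0..k-1; col[i] the one
            # on nodes 0..k-2 and i. Combine them by linear interpolation.
            col[i] = (col[k - 1] * (ts[i] - t) - col[i] * (ts[k - 1] - t)) / (ts[i] - ts[k - 1])
    return col[-1]


# Ordinates ordered by nearness to zero, with the matching abscissas.
ys = [0.103, -0.344, 0.472, 0.769, -0.875]
xs = [1.3, 1.4, 1.2, 1.1, 1.5]

# Interpolate x as a function of y at y = 0.
root_estimate = aitken_value(ys, xs, 0.0)
print(round(root_estimate, 5))
```

The estimate agrees with the true zero 1.32472 to about four places, which is
consistent with the remark that the last place of 1.3246 is in doubt.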

Evidently, if this problem were stated in its analytic form, recourse
to a semianalytic method such as that of successive substitutions or
the Newton-Raphson iteration (see §10.8) would also be appropriate.
Even when the correspondence is given only in tabular form, it would
also be possible to approximate the relation y = f(x) by the relation
$y = y_{0,\ldots,n}(x)$, where the equation of the approximation is expressed in
explicit polynomial form, with the help of Newton's interpolation
formula or of one of the other formulas to be obtained, and to solve the
resultant approximating algebraic equation by such iterative methods.
However, in order to estimate the accuracy obtained, it would be
desirable to repeat the calculation for several values of n, each of which
would lead to a distinct algebraic equation.

If dy/dx vanishes near the point $(\bar{x}, \bar{y})$ where the inverse
interpolation is to be effected, then the derivative of the inverse function
becomes infinite near that point, and a satisfactory approximation to the
inverse function cannot be obtained by using a polynomial of low degree. In
such a case, a simple iterative procedure is useful. For this purpose,
suppose first that two abscissas $x_a$ and $x_b$ are available with the property
that $\bar{y}$ lies between $y_a = f(x_a)$ and $y_b = f(x_b)$ (see Fig. 2.1).
If $y_a$ and $y_b$ are sufficiently nearly equal, and if dy/dx ≠ 0 in the
interval between $x_a$ and $x_b$, linear inverse interpolation may then be used
to obtain a first approximation to $\bar{x}$, say $\bar{x}^{(1)}$. Then, by
direct interpolation, using the ordinates $y_a$, $y_b$, and an appropriate
number of other known ordinates, the true value $f(\bar{x}^{(1)})$ may be
approximated. Then, if that result is designated as $\bar{y}^{(1)}$, linear
inverse interpolation based on $\bar{y}^{(1)}$ and either $y_a$ or $y_b$
(whichever one is separated from $\bar{y}^{(1)}$ by $\bar{y}$) is used to
determine a second approximation to $\bar{x}$, say $\bar{x}^{(2)}$, and the
cycle of operations is repeated as often as necessary.
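The cycle just described can be sketched in Python as follows. This is a
hedged illustration, not a definitive implementation: the names `interpolate`
and `inverse_by_iteration` are hypothetical, the direct interpolation is
carried out by Aitken's scheme on all the tabulated points, and the data of
the preceding example are reused purely for convenience (the derivative does
not vanish near the zero in that example).

```python
def interpolate(xs, ys, x):
    """Direct polynomial interpolation through all points (Aitken's scheme)."""
    col = list(ys)
    n = len(xs)
    for k in range(1, n):
        for i in range(k, n):
            col[i] = (col[k - 1] * (xs[i] - x) - col[i] * (xs[k - 1] - x)) / (xs[i] - xs[k - 1])
    return col[-1]


def inverse_by_iteration(xs, ys, y_bar, xa, xb, tol=1e-10, max_steps=50):
    """Find x with interpolated ordinate y_bar, starting from a bracket (xa, xb)."""
    ya, yb = interpolate(xs, ys, xa), interpolate(xs, ys, xb)
    x_new = xa
    for _ in range(max_steps):
        # Linear inverse interpolation between the two current end points.
        x_new = xa + (y_bar - ya) * (xb - xa) / (yb - ya)
        # High-order direct interpolation for the ordinate at x_new.
        y_new = interpolate(xs, ys, x_new)
        if abs(y_new - y_bar) < tol:
            break
        # Retain the end point which is separated from y_new by y_bar.
        if (ya - y_bar) * (y_new - y_bar) < 0:
            xb, yb = x_new, y_new
        else:
            xa, ya = x_new, y_new
    return x_new


xs = [1.1, 1.2, 1.3, 1.4, 1.5]
ys = [0.769, 0.472, 0.103, -0.344, -0.875]
root = inverse_by_iteration(xs, ys, 0.0, 1.3, 1.4)
print(round(root, 5))
```

Only the inverse step is linear; all higher-order interpolation is effected on
the direct function, which is the feature recommended in the text.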
Fig. 2.1

Methods of this sort, in which the only inverse interpolation involved
is linear, and in which high-order interpolation is effected only on the
direct function f(x), are particularly to be recommended in those cases
when it is known that f(x) can be satisfactorily approximated by a
polynomial of reasonably low degree over an interval including $\bar{x}$, but
when it is difficult to be certain that the inverse function F(y), such that
$\bar{x} = F(\bar{y})$, also can be fairly approximated by a polynomial in y, of
comparable degree, over the corresponding interval in y. Whereas situations
of this sort obviously are to be anticipated when f'(x) vanishes for a real
value of x near $\bar{x}$, they may also occur in the absence of such a warning.

In critical cases, in particular in the case when dy/dx = 0 at the desired
point, it is usually desirable to use one of the semianalytic methods
mentioned earlier, in which f(x) is approximated by a polynomial p(x) and
the algebraic equation $p(x) = \bar{y}$ is solved by an appropriate iterative
method.

2.9. Supplementary References. Most of the texts on numerical analysis
listed in the bibliography deal with divided differences and with
the basic Newtonian interpolation formula. Blanch [45] discusses
the advantages of working with modified divided differences of the form
… $x_n]$, where h is a suitably chosen constant. A method
of iterated interpolation similar to that of Aitken [25] is due to Neville
[167]. For references to additional methods of inverse interpolation, see …

Section 2.2

1. Use (2.2.5) to calculate approximate values of f(x) when x = 1.1416, 1.1600, and
1.2000 from the following rounded data:

       x | 1.1275    1.1503    1.1735    1.1972
    f(x) | 0.11971   0.13954   0.15932   0.17903

2. Calculate the three first divided differences relevant to successive pairs of data
in Prob. 1, and use (2.2.4) to determine approximate values of f(x) for
x = 1.1600(0.0020)1.1700.†

3. Prove that $f[x_0, x_1]$ is independent of $x_0$ and $x_1$ if and only if f(x) is a
linear function.

4. If f(x) = u(x)v(x), show that

$$f[x_0, x_1] = u[x_0]\,v[x_0, x_1] + u[x_0, x_1]\,v[x_1].$$

5. If f'(x) is continuous for $x_0 \le x \le x_1$, show that

$$f[x_0, x_1] = f'(\xi)$$

for some $\xi$ between $x_0$ and $x_1$, and hence also that

$$f[x_0, x_0] \equiv \lim_{x_1 \to x_0} f[x_0, x_1] = f'(x_0).$$

Section 2.3

6. If the abscissas in Prob. 1 are numbered in increasing algebraic order, verify
the …

7. Suppose that $x_r = x_0 + rh$ (r = 1, 2, . . .), so that the abscissas are at a
uniform spacing h. Show that (2.3.3) then becomes

$$f[x_0, \ldots, x_k] = \sum_{i=0}^{k} \frac{(-1)^{k-i} f(x_i)}{i!\,(k - i)!\,h^k},$$

and thus deduce that

$$f[x_0, \ldots, x_k] = \frac{1}{k!\,h^k} \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} f(x_i)$$

in this case, where $\binom{k}{i}$ is the binomial coefficient.

† The notation x = m(h)n denotes that x is to take on values from x = m to
x = n at intervals of h.

8. Assuming that $x_r = x_0 + rh$, verify directly (from the definition) the truth
of the following special cases of the relation established in Prob. 7:

$$f[x_0, x_1] = \frac{1}{h}\,[f(x_1) - f(x_0)],$$

$$f[x_0, x_1, x_2] = \frac{1}{2!\,h^2}\,[f(x_2) - 2f(x_1) + f(x_0)],$$

$$f[x_0, x_1, x_2, x_3] = \frac{1}{3!\,h^3}\,[f(x_3) - 3f(x_2) + 3f(x_1) - f(x_0)].$$


9. If f'(x) = df(x)/dx, show that … $f'[x_0, x_1]$ … f(x) is linear.

10. If f(x) = u(x)v(x), show that the relation established in Prob. 4
generalizes to the form

$$f[x_0, \ldots, x_n] = \sum_{k=0}^{n} u[x_0, \ldots, x_k]\, v[x_k, \ldots, x_n].$$

[Use induction, assuming the truth of the relation for n = N, showing that then

$$f[x_1, \ldots, x_{N+1}] - f[x_0, \ldots, x_N] = \sum_{k=0}^{N} \Big\{ (x_{N+1} - x_k)\, u[x_0, \ldots, x_k]\, v[x_k, \ldots, x_{N+1}]
 + (x_{k+1} - x_0)\, u[x_0, \ldots, x_{k+1}]\, v[x_{k+1}, \ldots, x_{N+1}] \Big\},$$

and that this expression properly reduces to

$$(x_{N+1} - x_0) \Big\{ u[x_0]\, v[x_0, \ldots, x_{N+1}] + \sum_{k=1}^{N} u[x_0, \ldots, x_k]\, v[x_k, \ldots, x_{N+1}]
 + u[x_0, \ldots, x_{N+1}]\, v[x_{N+1}] \Big\}.]$$


11. Repeat the calculations of Prob. 2, making use of the second difference
$f[x_0, x_1, x_2]$.

12. Compare the results of Prob. 11 with those obtained by using the second
difference $f[x_1, x_2, x_3]$ instead.

13. Obtain the formula

$$\int_{x_0}^{x_1} f(x)\,dx = (x_1 - x_0) f(x_0) + \tfrac{1}{2}(x_1 - x_0)^2 f[x_0, x_1] - \tfrac{1}{6}(x_1 - x_0)^3 f[x_0, x_1, x_2] + E,$$

where

$$E = \int_{x_0}^{x_1} (x - x_0)(x - x_1)(x - x_2)\, f[x_0, x_1, x_2, x]\,dx.$$

14. Apply the formula of Prob. 13, neglecting the error term, to the data of Prob. 1,
obtaining approximate values of the integral of f(x) over each subinterval, and
obtaining also approximate values of the integral from the smallest abscissa to each
of the others. Then use interpolation to obtain an approximate value of the integral
over the range (1.14, 1.18).


Section 2.5 


15. The following table lists the rounded value of the probability Q that the
magnitude of a normally distributed error, with mean value zero and standard
deviation unity, exceeds ε, for certain values of ε and Q. Calculate from it
approximate values of Q for ε = 0.7, 0.9, 1.1, and 1.2.

    ε | 0.4       0.5       0.6       0.8       1.0       1.25
    Q | 0.68916   0.61708   0.54851   0.42371   0.31731   0.21130

16. Use the data of Prob. 15 to calculate approximate values of ε for Q = 0.4, 0.5,
and 0.6.

17. Suppose that values of f(x), f'(x), and f''(x) are known for $x = x_0$, values of
f(x) and f'(x) for $x = x_1$, and the value of f(x) for $x = x_2$. Show that the
corresponding divided-difference table appears as follows, through third differences,
where each difference is formed from diagonally adjacent entries to its left by the
usual rule, the values of the derivatives being entered in advance:

    x_0 | f(x_0)
                   f'(x_0)
    x_0 | f(x_0)                 (1/2) f''(x_0)
                   f'(x_0)                          f[x_0,x_0,x_0,x_1]
    x_0 | f(x_0)                 f[x_0,x_0,x_1]
                   f[x_0,x_1]                       f[x_0,x_0,x_1,x_1]
    x_1 | f(x_1)                 f[x_0,x_1,x_1]
                   f'(x_1)                          f[x_0,x_1,x_1,x_2]
    x_1 | f(x_1)                 f[x_1,x_1,x_2]
                   f[x_1,x_2]
    x_2 | f(x_2)

Show also that Sheppard's rule remains applicable to any "difference path" made up
of contiguous diagonal segments, and write down the formula which introduces
successively the values of $f(x_0)$, $f'(x_0)$, $f''(x_0)$, $f(x_1)$, $f'(x_1)$, and
$f(x_2)$.

18. The following rounded values of Q(ε) and its derivative Q'(ε) are known. By
appropriately modifying the procedure illustrated in Prob. 17, construct a suitable
difference table and calculate approximate values of Q for ε = 0.2(0.2)0.8.

      ε  |   Q        Q'
     0.0 | 1.0000   -0.7979
     0.5 | 0.6171   -0.7041
     1.0 | 0.3173   -0.4839


19. Assuming that the third divided difference of f(x) is constant for all x, fill in the
blanks in the following divided-difference table (from right to left), and hence evaluate
f'(8) and f''(8):

    x | f(x)
    0 |   3
               1
    1 |   4          4
              13           1
    3 |  30         10
              63           1
    6 | 219         17
             148           -
    8 | 515          -
               -           -
    8 | 515          -
               -
    8 | 515

Also use a similar procedure to obtain f'(3). Determine an analytic expression
for f(x) and check the results.

20. If $f(x_1)$, $f(x_2)$, and $f(x_3)$ are values of f(x) near a maximum or minimum
point at $x = \bar{x}$, obtain the approximation

$$\bar{x} \approx \frac{x_1 + x_2}{2} - \frac{f[x_1, x_2]}{2 f[x_1, x_2, x_3]}$$

and show that it can also be written in the more symmetrical form

$$\bar{x} \approx \frac{x_1 + 2x_2 + x_3}{4} - \frac{f[x_1, x_2] + f[x_2, x_3]}{4 f[x_1, x_2, x_3]}.$$

Show also that, when the abscissas are equally spaced, it becomes

$$\bar{x} \approx x_2 - \frac{h}{2}\,\frac{f_3 - f_1}{f_3 - 2f_2 + f_1},$$

where h is the common interval.


Section 2.6 


21. Show that the truncation error associated with linear interpolation of f(x), using
ordinates at $x_0$ and $x_1$ with $x_0 \le x \le x_1$, is not larger in magnitude than

$$\tfrac{1}{8} M_2 (x_1 - x_0)^2,$$

where $M_2$ is the maximum value of |f''(x)| in the interval $(x_0, x_1)$. Does this
result apply to extrapolation?

22. Under the assumption that the data in Prob. 1 correspond to the function
f(x) = sin (log x), show that the truncation error corresponding to linear
interpolation between successive ordinates is smaller than one unit in the fourth
decimal place.

23. Show that the magnitude of the truncation error corresponding to linear
interpolation of the error function

$$\operatorname{erf} x = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt,$$

between $x_0$ and $x_1$, cannot exceed

$$\frac{(x_1 - x_0)^2}{2\sqrt{2\pi e}},$$

and hence is smaller than $(x_1 - x_0)^2/8$.

24. In the special case when the abscissas are equally spaced, with separation h,
show that the magnitude of the truncation error corresponding to second-order
interpolation based on ordinates at $x_0$, $x_1$, and $x_2$ does not exceed
$(M_3 h^3)/(9\sqrt{3})$, where $M_3$ is the maximum value of |f'''(x)| over the
interval $(x_0, x_2)$. Show also that, on the average, the largest errors may be
expected to occur at distances of about $h/\sqrt{3} \approx 0.58h$ from the central
abscissa. (Translate the origin to the point $x = x_1$.)

25. Show that the magnitude of the truncation error associated with third-order
interpolation based on ordinates at the equally spaced points $x_0$, $x_1$, $x_2$, and
$x_3$ does not exceed $(3M_4 h^4)/128$ for interpolation between $x_1$ and $x_2$ and
is, on the average, largest at the center of that interval. Show also that it does not
exceed $(M_4 h^4)/24$ for interpolation between $x_0$ and $x_1$ or between $x_2$ and
$x_3$, with a maximum to be expected, on the average, at a distance of about
$(3 - \sqrt{5})h/2 \approx 0.38h$ from $x_0$ or $x_3$, where $M_4$ is the maximum
value of $|f^{iv}(x)|$ in $(x_0, x_3)$ in all cases. [Translate the origin to the
midpoint $(x_1 + x_2)/2$.]

26. Obtain the formula

$$\begin{aligned}
f(x) = {} & f(x_0) + (x - x_0) f'(x_0) + (x - x_0)^2 f[x_0, x_0, x_1] \\
          & + (x - x_0)^2 (x - x_1) f[x_0, x_0, x_1, x_1] + E(x),
\end{aligned}$$

where

$$E(x) = \tfrac{1}{24}(x - x_0)^2 (x - x_1)^2 f^{iv}(\xi) \qquad (x_0 < x,\ \xi < x_1),$$

and show that

$$|E(x)| \le \frac{(x_1 - x_0)^4}{384} \max_{x_0 \le x \le x_1} |f^{iv}(x)|.$$

27. If f(x) = 1/(x + 1) and y(x) is the polynomial approximation of degree n which
agrees with f(x) when x = 0, 1, 2, . . . , n, show that the use of (2.6.5) leads to the
error bound

$$|E(x)| < |x(x - 1) \cdots (x - n)|,$$

whereas (2.6.1) permits the less conservative bound

$$|E(x)| < \frac{1}{(n + 1)!}\,|x(x - 1) \cdots (x - n)|$$

when $x \ge 0$.

28. Suppose that a table presents values of f(x) rounded to r decimal places at a
uniform interval h in x, and that linear interpolation is employed for the calculation
of $f(\bar{x})$. Suppose also that the tabular abscissas are exact, that the abscissa
$\bar{x}$ is rounded to s decimal places, and that the calculated approximate value of
$f(\bar{x})$ is rounded to t decimal places. If f''(x) is continuous over the tabular
range, and if δ is the total error in the resultant interpolate, show that

$$|\delta| \le \tfrac{1}{8} M_2 h^2 + 5 M_1 \times 10^{-s-1} + 5 \times 10^{-r-1} + 5 \times 10^{-t-1},$$

where $M_1$ and $M_2$ are the maximum values of |f'(x)| and |f''(x)|, respectively, in
the tabular range.

29. The function $f(x) = \log_{10} \sin x$ is tabulated for x = 0.01(0.01)2.00 to five
decimal places. If linear interpolation is employed, with the abscissa of the
interpolant rounded to five decimal places, and if the calculated result is also
rounded to five places, determine the portions of the table for which the results
certainly will be correct within five units, and within fifty units, in the fifth place.
What accuracy could be guaranteed over those ranges if the abscissa of the interpolant
were rounded to four places? To three places?


30. A table of values of the function $f(x) = (x^4 - x)/12$ is to be constructed for
$0 \le x \le a$, in such a way that the error in linear interpolation would not exceed $\epsilon$
if the effects of round-off were negligible. Show that a uniform spacing h not larger
than $2\sqrt{2\epsilon}/a$, and hence at least $a^2/(2\sqrt{2\epsilon})$ entries, would be required.
Show also that, if the range (0,a) were divided into the ranges $(0,\alpha)$ and $(\alpha,a)$,
and different uniform spacings $h_1$ and $h_2$ were used in those ranges, the most efficient
division would be such that the conditions $\alpha = a/2$, $h_2 = h_1/2$ are approximately
satisfied, corresponding to a reduction of about 25 per cent in the number of entries.
31. From the following table of rounded values of $f(x) = (x/10)^{1/2}$, construct a
divided-difference table and determine successive approximations to f(…) corresponding
to the use of one, two, three, four, and five successive ordinates, starting with
x = 0. Compare these results with the true value. Could the resulting situation have been
predicted (without direct calculation) from a knowledge of the analytical form of f(x)?
What preliminary warning is afforded by reference to the difference table alone?

x    | 0        1        2        3        4
f(x) | 0.00000  0.31623  0.44721  0.54772  0.63246


32. Form a divided-difference table based only on the values of the function
$f(x) = x^5 - 5x^3 + x^2 + 4x - 2$ at the points x = -2, -1, 0, 1, and 2, and
interpolate from this table approximate values of f(x) at x = -1.5, -0.5, 0.5, and 1.5.
Compare them with the true values. How could the possibility of such errors
have been predicted (without direct calculation) assuming knowledge of the
analytical form of f(x)?


Section 2.7 


33. Use the Aitken procedure to determine Q(0.7) and Q(0.5) as accurately as
possible from the data of Prob. 15.
34. Use the Aitken procedure to determine f(0.20000) as accurately as possible from
the following rounded values of $f(x) = \sin[\sinh^{-1}(x + 1)]$:

x    | 0.17520  0.25386  0.33565  0.42078  0.50946
f(x) | 0.84147  0.86742  0.89121  0.91276  0.93204


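35. Deduce the validity of Aitken’s method by establishing the relations

$$ y_{0,1,\ldots,m,n}(x) = y_{0,1,\ldots,m}(x) + (x - x_0) \cdots (x - x_{m-1})(x - x_m) f[x_0, \ldots, x_m, x_n], $$
$$ y_{0,1,\ldots,m,n}(x) = y_{0,1,\ldots,m-1,n}(x) + (x - x_0) \cdots (x - x_{m-1})(x - x_n) f[x_0, \ldots, x_m, x_n], $$

and eliminating $f[x_0, \ldots, x_m, x_n]$ between them.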


Section 2.8 


36. If y = f(x) and if $f'(x) \ne 0$ for $x_0 \le x \le x_1$, show that the truncation error in
linear inverse interpolation based on corresponding values $(x_0,y_0)$ and $(x_1,y_1)$ is
given by

$$ -(y - y_0)(y - y_1) \frac{f''(\xi)}{2[f'(\xi)]^3}, $$

where $x_0 < \xi < x_1$, if $f''(x)/[f'(x)]^3$ exists and is continuous in that interval. Show
also that the magnitude of this error is limited by each of the bounds

$$ \frac{(y_1 - y_0)^2}{8} K, \qquad \frac{h^2}{8} M_1^2 K, \qquad \frac{h^2}{8} \left(\frac{M_1}{m_1}\right)^2 \frac{M_2}{m_1}, $$

if $h = x_1 - x_0$, $|f''(x)/[f'(x)]^3| \le K$, $m_1 \le |f'(x)| \le M_1$, and $|f''(x)| \le M_2$ for
$x_0 \le x \le x_1$.

37. Suppose that $f(x) = x^2$ is tabulated for $0 \le x \le 1$ with a uniform spacing of
h in x. Assuming that sufficiently many significant figures are supplied and retained
in the calculation to permit the neglect of the effects of round-off errors, determine a
(as a function of h and $\epsilon$) so that the error of linear inverse interpolation will not exceed
a specified quantity $\epsilon$ over the range (a,1). What spacing would be required to assure
an accuracy within 0.005 for $0.1 \le x \le 1.0$?


38. Repeat the calculations of Prob. 37 when

$$ f(x) = \int_0^x \sin t^2 \, dt. $$

[Use the inequality $\sin u > 2u/\pi$ $(0 < u < \pi/2)$ in bounding the error.]
39. Given the following data, use the iterative process of inverse and direct
interpolation to determine, to four decimal places, the value of x between 1.50 and 1.60 for
which f(x) = 0.99800:

x    | 1.40     1.50     1.60     1.70     1.80
f(x) | 0.98545  0.99749  0.99957  0.99166  0.97385





40. Calculate an approximation to the value of x required in Prob. 39 by
approximating f(x) by the parabola y(x) which agrees with f(x) at the points for which
x = 1.50, 1.60, and 1.70, and solving the quadratic equation y(x) = 0.99800. Then
use the iterative method of Prob. 39 to obtain an improved approximation which may
be expected to be correct to four decimal places.

41. The following critical table for the function $f(x) = x(x - 1)(2x - 1)/12$ has the
property that, for any x between successive tabular abscissas, the corresponding value
of f(x) rounds to the entry given for that range:

    x          f(x)
 0.05667
              0.0040
 0.05844
              0.0041
 0.06025
              0.0042
 0.06208
              0.0043
 0.06394

Construct the table, by first tabulating f(x) for appropriate convenient values of x
and then using inverse interpolation to obtain x when f(x) = 0.00395(0.00010)0.00435,
or otherwise.





CHAPTER 3 


LAGRANGIAN METHODS 


3.1. Introduction. For many purposes, it is desirable that a formula 
for interpolation, numerical differentiation, or numerical integration be 
expressed explicitly in terms of the ordinates involved, rather than in 
terms of their differences or divided differences. Such formulas permit 
a more direct consideration of the effect on the end result of a change or 
error in one or more of the ordinates, and their use does not require the 
calculation or tabulation of differences. However, it is found that these 
advantages are attained only at the sacrifice of others. 

The basic formula, apparently due to Waring, but associated with the 
name of Lagrange, is derived in §3.2, and its general use in interpolation, 
differentiation, and integration is illustrated in §§3.3 and 3.4. Certain 
specific formulas for numerical integration and differentiation are derived 
from it, in the cases when the abscissas are equally spaced, in the remain- 
ing sections of the chapter. 

3.2. Lagrange’s Interpolation Formula. Lagrange’s form of the polynomial
$y(x) = y_{0,\ldots,n}(x)$ of degree n, which takes on the same values as
a given function f(x) for the n + 1 distinct abscissas $x_0, x_1, \ldots, x_n$,
differs from the Newtonian form derived in §2.5 in that the ordinates
involved are displayed explicitly in the Lagrangian form, while the
Newtonian form explicitly involves divided differences of those ordinates.
Whereas it clearly must be possible to derive Lagrange’s form from (2.5.2),
its importance justifies the indication of three alternative methods of
approach, which are typical of methods also useful in other considerations.

As a first approach, we could write y(x) in the form

$$ y(x) = A_0 + A_1 x + \cdots + A_n x^n = \sum_{k=0}^{n} A_k x^k, $$   (3.2.1)

where the A’s are to be determined in such a way that $y(x_i) = f(x_i)$ for
$i = 0, 1, \ldots, n$. These requirements are represented by the n + 1
linear equations

$$ A_0 + A_1 x_0 + A_2 x_0^2 + \cdots + A_n x_0^n = f(x_0), $$
$$ \cdots\cdots $$
$$ A_0 + A_1 x_n + A_2 x_n^2 + \cdots + A_n x_n^n = f(x_n). $$   (3.2.2)

If these equations are solved by use of determinants, the use of special
properties of the determinants involved leads to rather simple expressions
for the A’s in terms of the ordinates, and the introduction of the
results into (3.2.1) leads to the desired result (see Prob. 5). The
requirement that the A’s satisfy (3.2.1) and (3.2.2) can be expressed by the
condition

$$ \begin{vmatrix} y & 1 & x & x^2 & \cdots & x^n \\ f(x_0) & 1 & x_0 & x_0^2 & \cdots & x_0^n \\ f(x_1) & 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ f(x_n) & 1 & x_n & x_n^2 & \cdots & x_n^n \end{vmatrix} = 0, $$   (3.2.3)

the expanded form of which would also give the explicit form of the
interpolation polynomial $y = y_{0,\ldots,n}(x)$.


Alternatively, we could write y(x) directly in the required form

$$ y(x) = l_0(x) f(x_0) + l_1(x) f(x_1) + \cdots + l_n(x) f(x_n) = \sum_{k=0}^{n} l_k(x) f(x_k), $$   (3.2.4)

where $l_0(x), \ldots, l_n(x)$ are polynomials of degree n or less, to be
determined by the requirement that the result of replacing y(x) by f(x) be an
identity when f(x) is an arbitrary polynomial of degree n or less. It is
clear that this situation will prevail if and only if the result of replacing
y(x) by f(x) is an identity when $f(x) = 1, x, x^2, \ldots, x^n$. These
requirements are represented by the n + 1 equations

$$ l_0(x) + l_1(x) + \cdots + l_n(x) = 1, $$
$$ x_0 l_0(x) + x_1 l_1(x) + \cdots + x_n l_n(x) = x, $$
$$ \cdots\cdots $$
$$ x_0^n l_0(x) + x_1^n l_1(x) + \cdots + x_n^n l_n(x) = x^n, $$   (3.2.5)

from which the coefficient functions can be determined directly as ratios
of determinants which can be expanded in simple forms. The eliminant
of the Eqs. (3.2.4) and (3.2.5) is merely the result of interchanging rows
and columns in the array whose determinant appears in (3.2.3), so that
the equivalence of the final forms is indeed confirmed.
Rather than pursue either of these lines, we may avoid somewhat
lengthy calculation by noticing that the expression (3.2.4) will indeed
take on the value $f(x_i)$ when $x = x_i$ if $l_i(x_i) = 1$ and if $l_k(x_i) = 0$
when $k \ne i$. With the convenient notation of the so-called Kronecker delta,

$$ \delta_{ij} = \begin{cases} 0 & \text{if } i \ne j, \\ 1 & \text{if } i = j, \end{cases} $$   (3.2.6)

this requirement becomes merely

$$ l_i(x_j) = \delta_{ij} \qquad (i = 0, 1, \ldots, n; \; j = 0, 1, \ldots, n). $$   (3.2.7)

Since $l_i(x)$ is thus to be a polynomial of degree n which vanishes when
$x = x_0, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$, there must follow

$$ l_i(x) = C_i [(x - x_0) \cdots (x - x_{i-1})(x - x_{i+1}) \cdots (x - x_n)], $$   (3.2.8)

where $C_i$ is a constant. The final requirement $l_i(x_i) = 1$ then determines
$C_i$ in the form

$$ C_i = \frac{1}{(x_i - x_0) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_n)}, $$   (3.2.9)

and the desired Lagrangian coefficient functions $l_i(x)$ are obtained by
introducing (3.2.9) into (3.2.8).

In order to put this result in a somewhat more compact form, we first
review the notation of (2.6.2):

$$ \pi(x) = (x - x_0)(x - x_1) \cdots (x - x_n). $$   (3.2.10)

Now the derivative of $\pi(x)$ is clearly expressible as the sum of n + 1 terms,
in each of which one of the factors of $\pi(x)$ is deleted. Thus, if we set
$x = x_i$ in this expression, we obtain the useful result

$$ \pi'(x_i) = (x_i - x_0) \cdots (x_i - x_n), $$   (3.2.11)

where the factor $(x_i - x_i)$ is to be omitted in the product. Thus, after
introducing (3.2.8) and (3.2.9) into (3.2.4), we obtain the Lagrangian
interpolation polynomial of degree n in the form

$$ y(x) = \sum_{k=0}^{n} \frac{\pi(x)}{(x - x_k)\,\pi'(x_k)} f(x_k) = \sum_{k=0}^{n} l_k(x) f(x_k), $$   (3.2.12)

where

$$ l_i(x) = \frac{\pi(x)}{(x - x_i)\,\pi'(x_i)} = \frac{(x - x_0) \cdots (x - x_{i-1})(x - x_{i+1}) \cdots (x - x_n)}{(x_i - x_0) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_n)}. $$   (3.2.13)

The first expression for $l_i(x)$ is useful in theoretical considerations, the
second in the actual calculation of the function.

It should be noticed that the definitions of the functions $\pi(x)$ and $l_i(x)$
involve the degree n of the interpolation polynomial. Generally, in the
sequel, the value of n will be clear from the context. When a more
explicit notation is necessary, we may replace $l_i(x)$ by $l_{i,n}(x)$ and $\pi(x)$ by
$\pi_n(x)$.

The direct derivation of this result from the Newtonian form (2.5.2) is
of some academic interest and may be effected by making use of (2.3.…)
and comparing (2.3.3) with (3.2.11).

In view of the equivalence of (3.2.12) and (2.5.2), the error committed
by replacing f(x) by y(x) is again given by either (2.6.1) or (2.6.5), so that

$$ f(x) = \sum_{k=0}^{n} l_k(x) f(x_k) + E(x), $$   (3.2.14)

where

$$ E(x) = \pi(x) f[x_0, \ldots, x_n, x] = \pi(x) \frac{f^{(n+1)}(\xi)}{(n + 1)!} $$   (3.2.15)

and where, as before, $\xi$ is some number in the interval I limited by the
largest and smallest of the numbers $x_0, x_1, \ldots, x_n$, and x.

To illustrate the use of the Lagrangian formula, we may write down
the interpolation polynomial of degree three relevant to the data

x    | -1   0   1   2
f(x) |  1   1   1  -5

in the form

$$ y = \frac{(x - 0)(x - 1)(x - 2)}{(-1 - 0)(-1 - 1)(-1 - 2)} (1) + \frac{(x + 1)(x - 1)(x - 2)}{(0 + 1)(0 - 1)(0 - 2)} (1) + \frac{(x + 1)(x - 0)(x - 2)}{(1 + 1)(1 - 0)(1 - 2)} (1) + \frac{(x + 1)(x - 0)(x - 1)}{(2 + 1)(2 - 0)(2 - 1)} (-5) $$
$$ = -\tfrac{1}{6} x(x - 1)(x - 2) + \tfrac{1}{2} (x + 1)(x - 1)(x - 2) - \tfrac{1}{2} (x + 1)x(x - 2) - \tfrac{5}{6} (x + 1)x(x - 1), $$

which may be reduced to

$$ y = -x^3 + x + 1. $$

For the purpose of actual numerical interpolation, the reduction to this
final form would not be necessary.

On the other hand, whereas the Newtonian method would involve the
formation of the divided-difference table

  x    f(x)
 -1     1
              0
  0     1           0
              0          -1
  1     1          -3
             -6
  2    -5

the use of the indicated difference path would involve only the following
calculation:

$$ y = 1 + x(0) + x(x - 1)(0) + x(x - 1)(x + 1)(-1) = 1 - x(x - 1)(x + 1) = -x^3 + x + 1. $$

The Lagrange form of the interpolation formula $f(x) \approx y(x)$ possesses
the advantage that its use does not involve preliminary differencing of
data. However, it has the disadvantage that, unless f(x) is given
analytically, so that use may be made of the second form of (3.2.15), it is
difficult to estimate the truncation error relevant to the result afforded
by interpolation based on a given number of ordinates, or to estimate the
number of ordinates needed to reduce the truncation error below
prescribed limits. If the Newtonian formula is used, a more or less
dependable estimate of accuracy, based essentially on the first form of (3.2.15),
may be obtained by sampling the first neglected higher-order difference.

Furthermore, in order to improve a certain result by taking into
account one or more additional ordinates, the coefficient functions $l_i(x)$
would have to be completely redetermined in the Lagrangian procedure,
whereas the Newtonian procedure would require merely the formation
of a higher-order difference, and the addition of a multiple of that
difference to the previously calculated result.
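As an illustration of the product form (3.2.13) and the sum (3.2.12), the following minimal Python sketch evaluates the Lagrangian polynomial for the four-point data of the illustration above (abscissas -1, 0, 1, 2 with ordinates 1, 1, 1, -5). The function names are hypothetical, and exact rational arithmetic is an assumption made here so that the comparison with $-x^3 + x + 1$ is free of round-off.

```python
from fractions import Fraction

def lagrange_coefficient(i, xs, x):
    """Evaluate l_i(x) from the product form (3.2.13)."""
    value = Fraction(1)
    for j, xj in enumerate(xs):
        if j != i:
            value *= Fraction(x - xj) / Fraction(xs[i] - xj)
    return value

def lagrange_interpolate(xs, fs, x):
    """Evaluate y(x) = sum over k of l_k(x) f(x_k), as in (3.2.12)."""
    return sum(lagrange_coefficient(k, xs, x) * fk for k, fk in enumerate(fs))

xs = [-1, 0, 1, 2]      # abscissas of the illustration
fs = [1, 1, 1, -5]      # corresponding ordinates
x = Fraction(1, 2)
print(lagrange_interpolate(xs, fs, x))   # agrees with -x**3 + x + 1 at x = 1/2
```

Adding a fifth ordinate would change every factor in `lagrange_coefficient`, which is the redetermination referred to in the text.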

On the other hand, the Lagrangian form is much better adapted to the
analysis of the effects of inherent errors in the data. Thus, if the original
data were all correctly rounded to r decimal places, so that the maximum
error in each given ordinate is $5 \times 10^{-r-1}$, it is seen that the largest
possible corresponding error in the interpolation for f(x) would be

$$ |R(x)|_{\max} = (5 \times 10^{-r-1}) \sum_{k=0}^{n} |l_k(x)|. $$   (3.2.16)

The corresponding calculation based on the Newtonian form would be
more complicated but would, of course, lead to the same result. In
addition to this error, the errors due to truncation and to intermediate
round-offs must be taken into account in either case.
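The bound (3.2.16) is simple to evaluate numerically. The sketch below is only illustrative: the abscissas, the point of interpolation, and the number r of decimal places are assumed values, and the function names are hypothetical.

```python
def lagrange_coefficients(xs, x):
    """Return [l_0(x), ..., l_n(x)] from the product form (3.2.13)."""
    coeffs = []
    for i, xi in enumerate(xs):
        c = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                c *= (x - xj) / (xi - xj)
        coeffs.append(c)
    return coeffs

def max_inherent_error(xs, x, r):
    """Bound (3.2.16): (5 x 10^(-r-1)) times the sum of |l_k(x)|."""
    half_unit = 5.0 * 10.0 ** (-r - 1)
    return half_unit * sum(abs(c) for c in lagrange_coefficients(xs, x))

xs = [1.00, 1.10, 1.20, 1.30]            # assumed equally spaced abscissas
bound = max_inherent_error(xs, 1.24, r=4)
print(bound)
```

Since the coefficients sum to unity, the bound can never fall below half a unit in the last place retained, and it equals that value at a tabular point.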

3.3. Numerical Differentiation and Integration. Once an interpolation
polynomial y(x) has been determined so that it satisfactorily approximates
a given function f(x) over a certain interval I, it may be hoped that
the result of differentiating y(x), or of integrating it over an interval,
will also satisfactorily approximate the corresponding derivative or
integral of f(x). However, if we visualize a curve, representing an
approximating function and oscillating about the curve which represents the
function approximated, we may anticipate the fact that, even though the deviation
between y(x) and f(x) be small throughout an interval, still the slopes of
the two curves representing them may differ quite appreciably. Further,
it is seen that round-off errors (or errors of observation) of alternating
sign in consecutive ordinates could affect the calculation of the
derivative quite strongly if those ordinates were fairly closely spaced.

On the other hand, since integration is essentially a smoothing process,
it would be anticipated that the error associated with integration may be
small even though the interpolation polynomial itself provides only a
moderately good approximation to f(x).

These expectations are borne out in practice. In particular,
numerical differentiation should be avoided wherever possible, particularly
when the data are empirical and subject to appreciable errors of
observation. When such a calculation must be made, it is desirable first to
smooth the data to a certain extent. Certain methods of effecting such
a smoothing are considered in §7.13.
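The contrast between differentiation and integration can be seen in a small numerical experiment. The following sketch assumes ordinates of sin x at spacing 0.01 carrying errors of alternating sign and of magnitude 5 x 10^-5; the simple forward-difference and trapezoidal formulas are used only for illustration and are not the formulas developed in this section.

```python
import math

def perturbed_ordinates(f, a, h, n, eps):
    """Ordinates f(a + k h) carrying errors +eps, -eps, +eps, ... (an assumed worst case)."""
    return [f(a + k * h) + (eps if k % 2 == 0 else -eps) for k in range(n + 1)]

def forward_difference(ys, h, k):
    """Slope estimate from two adjacent ordinates."""
    return (ys[k + 1] - ys[k]) / h

def trapezoid(ys, h):
    """Area estimate from equally spaced ordinates."""
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

a, h, n, eps = 1.0, 0.01, 20, 5e-5
exact = [math.sin(a + k * h) for k in range(n + 1)]
noisy = perturbed_ordinates(math.sin, a, h, n, eps)

# Change produced by the ordinate errors alone in each calculation.
slope_shift = abs(forward_difference(noisy, h, 0) - forward_difference(exact, h, 0))
area_shift = abs(trapezoid(noisy, h) - trapezoid(exact, h))
print(slope_shift, area_shift)
```

The slope is displaced by about 2 eps / h, which grows as the spacing is reduced, whereas the alternating errors very nearly cancel in the area.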

From the Lagrangian approximation

$$ f(x) \approx \sum_{k=0}^{n} l_k(x) f(x_k), $$   (3.3.1)

with associated error

$$ E(x) = \pi(x) \frac{f^{(n+1)}(\xi)}{(n + 1)!}, $$   (3.3.2)

we obtain the corresponding integral formula

$$ \int_a^b f(x)\,dx \approx \sum_{k=0}^{n} C_k f(x_k), $$   (3.3.3)

where the weighting coefficients $C_k$ are given by

$$ C_k = \int_a^b l_k(x)\,dx $$   (3.3.4)

and where the associated error can be expressed in the form

$$ E = \frac{1}{(n + 1)!} \int_a^b \pi(x) f^{(n+1)}(\xi)\,dx. $$   (3.3.5)

With regard to (3.3.5), it should be remembered that $\xi$ is a specific, but
generally unknown, function of x, so that even though the (n + 1)th
derivative of f were known analytically it generally would be impossible
to evaluate the integral defining E exactly. However, if it is known
that, say, $|f^{(n+1)}(x)| \le M$ in I, where I is limited by the largest and
smallest of $x_0, x_1, \ldots, x_n$, a, and b, and where M is a constant, it can be
deduced that

$$ |E| \le \frac{M}{(n + 1)!} \int_a^b |\pi(x)|\,dx. $$   (3.3.6)

Further, in those cases where no one of the abscissas $x_i$ lies in (a,b), the
function $\pi(x)$ does not change sign in (a,b) and the second law of the mean
(§1.7) may be invoked to show that

$$ E = \frac{f^{(n+1)}(\eta)}{(n + 1)!} \int_a^b \pi(x)\,dx, $$   (3.3.7)

where $\eta$ is some number in I. This last situation exists, in particular,
in the frequently occurring cases when the integration is carried out over
the interval between two adjacent tabular points.









66 INTRODUCTION TO NUMERICAL ANALYSIS 


Similarly, by differentiating (3.3.1) r times, one obtains the
approximation

$$ f^{(r)}(x) \approx \sum_{k=0}^{n} l_k^{(r)}(x) f(x_k), $$   (3.3.8)

with the associated error

$$ E^{(r)}(x) = \frac{1}{(n + 1)!} \frac{d^r}{dx^r} [\pi(x) f^{(n+1)}(\xi)]. $$   (3.3.9)

However, since the dependence of $\xi$ upon x is generally unknown, the
differentiation in (3.3.9) cannot be explicitly effected.

In order to obtain a somewhat more tractable form of the remainder,
we replace (3.3.2) by the equivalent first form of (3.2.15), which involves
the current variable x itself. The error (3.3.9) can then be expressed in
the form

$$ E^{(r)}(x) = \frac{d^r}{dx^r} \{\pi(x) f[x_0, \ldots, x_n, x]\}. $$   (3.3.10)

If use is made of Leibnitz’ formula for the rth derivative of a product,

$$ D^r(uv) = u\,D^r v + r\,Du\,D^{r-1}v + \frac{r(r - 1)}{2!} D^2u\,D^{r-2}v + \cdots + D^r u\,v = \sum_{i=0}^{r} \binom{r}{i} D^i u\,D^{r-i} v, $$   (3.3.11)

where D = d/dx and where $\binom{r}{i}$ represents the binomial coefficient

$$ \binom{r}{i} = \frac{r(r - 1) \cdots (r - i + 1)}{i!} = \frac{r!}{(r - i)!\,i!}, $$   (3.3.12)

Eq. (3.3.10) takes the form

$$ E^{(r)}(x) = \sum_{i=0}^{r} \binom{r}{i} \pi^{(i)}(x) \frac{d^{r-i}}{dx^{r-i}} f[x_0, \ldots, x_n, x] $$

or, making use of (2.3.9),

$$ E^{(r)}(x) = \sum_{i=0}^{r} \frac{r!}{i!} \pi^{(i)}(x) f[x_0, \ldots, x_n, \underbrace{x, \ldots, x}_{r-i+1 \text{ times}}]. $$   (3.3.13)

A generalization of the relation (2.6.6) leads to the fact that

$$ f[x_0, \ldots, x_n, \underbrace{x, \ldots, x}_{m \text{ times}}] = \frac{f^{(n+m)}(\xi)}{(n + m)!}, $$   (3.3.14)

where, for given n, $\xi$ lies somewhere in the interval I limited by the
largest and smallest of $x_0, \ldots, x_n$, and x. Hence, finally, (3.3.13) can be
expressed in the form

$$ E^{(r)}(x) = \sum_{i=0}^{r} \frac{r!}{i!\,(n + r - i + 1)!} \pi^{(i)}(x) f^{(n+r-i+1)}(\xi_i), $$   (3.3.15)

where each of the r + 1 numbers $\xi_0, \ldots, \xi_r$ lies in I.

The expression for the error is thus rather complicated in the general
case, and when the rth derivative is calculated by differentiating an
interpolation polynomial of nth degree, the estimation of the error may involve
the estimation of derivatives of f(x) of orders n + 1, n + 2, ..., n + r,
and n + r + 1 in the interval I.

It may be noticed that when r > n the right-hand member of (3.3.8)
vanishes identically, since $l_k(x)$ is a polynomial of degree n. Generally,
at best only derivatives of order r for which r is small relative to n are
given with any significant accuracy by this formula.

In the case r = 1, the formula (3.3.8) becomes

$$ f'(x) \approx \sum_{k=0}^{n} l_k'(x) f(x_k), $$   (3.3.16)

and the associated error, as given by (3.3.15), is of the form

$$ E'(x) = \pi'(x) \frac{f^{(n+1)}(\xi_1)}{(n + 1)!} + \pi(x) \frac{f^{(n+2)}(\xi_0)}{(n + 2)!}, $$   (3.3.17)

where both $\xi_0$ and $\xi_1$ lie in the interval I. In particular, for numerical
differentiation at a tabular point, there follows

$$ f'(x_i) = \sum_{k=0}^{n} l_k'(x_i) f(x_k) + \pi'(x_i) \frac{f^{(n+1)}(\xi_1)}{(n + 1)!}, $$   (3.3.18)

since $\pi(x)$ vanishes when $x = x_i$, where the factor $\pi'(x_i)$ has the simple
form

$$ \pi'(x_i) = (x_i - x_0) \cdots (x_i - x_n), $$   (3.3.19)

in accordance with (3.2.11).

It is seen that this relation is the result which would be obtained by
differentiating the formula

$$ f(x) = \sum_{k=0}^{n} l_k(x) f(x_k) + \pi(x) \frac{f^{(n+1)}(\xi)}{(n + 1)!} $$

with respect to x, overlooking the fact that $\xi$ is a function of x, setting
$x = x_i$ in the result, and changing $\xi$ to a new parameter $\xi_1$. Except in the
calculation of the first derivative at a tabular point, (3.3.15) shows
that this procedure would not generally yield the correct expression for the
error term.

It can be shown, however, that the error $E^{(r)}(x)$ can indeed be expressed
in the analogous form

$$ E^{(r)}(x) = \pi^{(r)}(x) \frac{f^{(n+1)}(\eta_r)}{(n + 1)!} $$   (3.3.20)

for any positive integer r, where $\eta_r$ is somewhere in I, when x is outside or
at one end of the range of the tabular values $x_0, \ldots, x_n$ (see …).
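The weights $l_k'(x_i)$ needed in (3.3.18) can be computed directly from the abscissas. The sketch below assumes the two identities obtained by differentiating $l_k(x) = \pi(x)/[(x - x_k)\pi'(x_k)]$, namely $l_k'(x_i) = \pi'(x_i)/[(x_i - x_k)\pi'(x_k)]$ when $i \ne k$ and $l_k'(x_k) = \sum_{j \ne k} 1/(x_k - x_j)$; these are not written out in the text, and the function names are hypothetical.

```python
from fractions import Fraction

def pi_prime(xs, i):
    """pi'(x_i): product of (x_i - x_j) over all j != i, as in (3.3.19)."""
    p = Fraction(1)
    for j, xj in enumerate(xs):
        if j != i:
            p *= Fraction(xs[i]) - Fraction(xj)
    return p

def derivative_weights(xs, i):
    """Weights l_k'(x_i), k = 0..n, for the first derivative at the tabular point x_i."""
    weights = []
    for k in range(len(xs)):
        if k == i:
            # Logarithmic derivative of the product form of l_k at x = x_k.
            w = sum(1 / (Fraction(xs[i]) - Fraction(xj))
                    for j, xj in enumerate(xs) if j != i)
        else:
            w = pi_prime(xs, i) / ((Fraction(xs[i]) - Fraction(xs[k])) * pi_prime(xs, k))
        weights.append(w)
    return weights

print(derivative_weights([0, 1, 2], 0))   # weights for f'(x_0) with unit spacing
print(derivative_weights([0, 1, 2], 1))   # weights for f'(x_1) with unit spacing
```

For spacing h the weights are to be divided by h; the three-point end and central formulas are recovered as special cases.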
3.4. Calculation. Since the coefficient function $l_i(x)$ can be expressed
in the form

$$ l_i(x) = \frac{(x - x_0) \cdots (x - x_{i-1})(x - x_{i+1}) \cdots (x - x_n)}{(x_i - x_0) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_n)}, $$   (3.4.1)

it is seen that the form of $l_i(x)$ is invariant under any linear change in
variables

$$ x = a + hs, \qquad x_i = a + hs_i, $$   (3.4.2)

where a and h are constants:

$$ l_i(x) = \frac{(s - s_0) \cdots (s - s_{i-1})(s - s_{i+1}) \cdots (s - s_n)}{(s_i - s_0) \cdots (s_i - s_{i-1})(s_i - s_{i+1}) \cdots (s_i - s_n)}. $$   (3.4.3)





It is often desirable to choose a and h in such a way that the dimensionless
variable s, which measures distance from a in units of h, takes on
convenient values at the tabular points used in any specific interpolation.
For equally spaced abscissas, h is conveniently identified with the spacing.

Thus, as a simple illustration, suppose that the following data are given
and that approximate values of f(x) at intervals of 0.01 are to be obtained
by Lagrange three-point interpolation:

x    | 1.10    1.20    1.35    1.50
f(x) | 0.8912  0.9320  0.9757  0.9975

For interpolation between x = 1.10 and 1.35, the transformation

$$ x = 1.20 + 0.1s $$

is useful, the variable s then taking on the values -1.0, 0, and 1.5 at the
tabular points, and the values -0.9, -0.8, ..., 1.3, 1.4 at the points
of interpolation. With the abbreviation

$$ y(x) \equiv y(1.20 + 0.1s) \equiv y_s, $$

the interpolation formula $f(x) \approx y(x)$ then becomes

$$ f(1.20 + 0.1s) \approx y_s = \frac{s(s - 1.5)}{(-1.0)(-2.5)} (0.8912) + \frac{(s + 1.0)(s - 1.5)}{(1.0)(-1.5)} (0.9320) + \frac{(s + 1.0)s}{(2.5)(1.5)} (0.9757) $$
$$ = \tfrac{1}{75} [26.736 s(s - 1.5) - 46.600 (s + 1.0)(s - 1.5) + 19.514 s(s + 1.0)], $$

a common denominator being extracted for final division, in order to
avoid all intermediate round-off. In the range between 1.35 and 1.50,
a new polynomial may be obtained, based on the ordinates at 1.20, 1.35,
and 1.50, with, say, x = 1.35 + 0.1s. A comparison of a few values
calculated between 1.20 and 1.35 by use of the new formula with those
calculated by the preceding formula would give an estimate of the
accuracy obtained.


If approximate values of $f'(x)$ are required at the same points, the
formula

$$ f'(1.20 + 0.1s) \approx y_s' = \tfrac{10}{75} (-0.7s + 2.71) = \tfrac{2}{15} (2.71 - 0.7s) $$

can be obtained by differentiating $y_s$ with respect to s and multiplying by
the factor ds/dx = 10. Finally, if approximate values of the function

$$ F(x) = \int_{1.10}^{x} f(x)\,dx $$

are required, say, at intervals of 0.01 in x, use can be made of the formula

$$ \int_{1.10}^{1.20+0.1s} f(x)\,dx \approx Y_s = 0.1 \int_{-1}^{s} y_s\,ds = \tfrac{1}{4500} (-0.7s^3 + 8.13s^2 + 419.4s + 410.57). $$

In the derivation of each of these formulas, the work was so arranged that
no round-offs were committed.

The approximate values at x = 1.30 may be obtained, for example, in
correspondence with s = 1.0, as

$$ f(1.30) \approx y_1 = \tfrac{1}{75} (72.26) = 0.96347, $$
$$ f'(1.30) \approx y_1' = \tfrac{2}{15} (2.01) = 0.26800, $$
$$ \int_{1.10}^{1.30} f(x)\,dx \approx Y_1 = \tfrac{1}{4500} (837.4) = 0.18609, $$

to five places (not more than four of which would be expected to be
significant).

It happens that the given data are four-place values of f(x) = sin x.
The approximations obtained may thus be considered as corresponding to
the true values

$$ \sin 1.30 \approx 0.96356, \qquad \cos 1.30 \approx 0.26750, \qquad \cos 1.10 - \cos 1.30 \approx 0.18610. $$
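The three results quoted at x = 1.30 can be checked by carrying out the polynomial arithmetic exactly. The following sketch is one way of doing so; the helper names and the representation of a polynomial by its list of coefficients (lowest power first) are choices made here, not part of the text.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest power first."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def lagrange_poly(nodes, values):
    """Coefficients (lowest power first) of the Lagrange interpolation polynomial."""
    total = [F(0)] * len(nodes)
    for i, si in enumerate(nodes):
        term = [values[i]]
        for j, sj in enumerate(nodes):
            if j != i:
                # Multiply by the linear factor (s - s_j) / (s_i - s_j).
                term = poly_mul(term, [-sj / (si - sj), 1 / (si - sj)])
        total = [t + c for t, c in zip(total, term)]
    return total

def poly_eval(p, s):
    return sum(c * s ** k for k, c in enumerate(p))

nodes = [F(-1), F(0), F(3, 2)]                        # s at x = 1.10, 1.20, 1.35
values = [F("0.8912"), F("0.9320"), F("0.9757")]      # tabulated ordinates
y = lagrange_poly(nodes, values)                      # y_s as a polynomial in s
dy = [k * c for k, c in enumerate(y)][1:]             # derivative with respect to s
Y = [F(0)] + [c / (k + 1) for k, c in enumerate(y)]   # antiderivative with respect to s

value = poly_eval(y, F(1))                            # y at x = 1.30 (s = 1)
slope = 10 * poly_eval(dy, F(1))                      # ds/dx = 10
area = F(1, 10) * (poly_eval(Y, F(1)) - poly_eval(Y, F(-1)))
print(float(value), float(slope), float(area))
```

Working in rational arithmetic mirrors the remark that no round-offs were committed in deriving the formulas.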


In the cases when the abscissas are uniformly spaced, the Lagrangian
coefficient functions have been tabulated rather extensively for various
values of n. Formulas involving an odd number of ordinates are most
often used, and, if that number is n + 1 = 2m + 1, the abscissas are
conventionally renumbered as $x_{-m}, \ldots, x_{-1}, x_0, x_1, \ldots, x_m$.
If the uniform spacing $x_{k+1} - x_k$ is denoted by h, and if s is measured
from the central point, so that

$$ x = x_0 + hs, \qquad x_i = x_0 + hi, $$   (3.4.4)

Eq. (3.4.3) then reduces to

$$ l_i(x) = \frac{(s + m)(s + m - 1) \cdots (s - i + 1)(s - i - 1) \cdots (s - m + 1)(s - m)}{(i + m)(i + m - 1) \cdots (1)(-1)(-2) \cdots (i - m + 1)(i - m)} \equiv L_i(s). $$   (3.4.5)

Thus

$$ L_0(s) = \frac{(1^2 - s^2)(2^2 - s^2) \cdots (m^2 - s^2)}{(m!)^2} $$

and

$$ L_i(s) = \frac{(-1)^{i+1} s(s + i)}{(m + i)!\,(m - i)!} \prod_{k=1,\; k \ne |i|}^{m} (k^2 - s^2) $$   (3.4.6)

for $i = \pm 1, \pm 2, \ldots, \pm m$.

In illustration, Table 3.1 presents exact values of the coefficients
for three-point (quadratic) interpolation to tenths, corresponding
to m = 1 (a corresponding five-point table is included in §4.12):

TABLE 3.1

  s     L_{-1}(s)   L_0(s)   L_1(s)
 0.0     0          1        0         0.0
 0.1    -0.045      0.99     0.055    -0.1
 0.2    -0.08       0.96     0.12     -0.2
 0.3    -0.105      0.91     0.195    -0.3
 0.4    -0.12       0.84     0.28     -0.4
 0.5    -0.125      0.75     0.375    -0.5
 0.6    -0.12       0.64     0.48     -0.6
 0.7    -0.105      0.51     0.595    -0.7
 0.8    -0.08       0.36     0.72     -0.8
 0.9    -0.045      0.19     0.855    -0.9
 1.0     0          0        1        -1.0
        L_1(s)      L_0(s)   L_{-1}(s)   s

From (3.4.6) it follows that $L_i(-s) = L_{-i}(s)$. Hence, for negative
values of s, to be read from the right-hand margin, the
column labels at the foot of the table are to be used.

Thus, for example, to interpolate the data

x    | 1.00    1.10    1.20    1.30
f(x) | 0.8415  0.8912  0.9320  0.9636

for f(1.24) by use of a three-point formula, the work would be centered
at the nearest tabular point, x = 1.20. With s = 0.04/0.10 = 0.4, and
with coefficients read from the preceding table, there would follow

$$ f(1.24) \approx (-0.12)(0.8912) + (0.84)(0.9320) + (0.28)(0.9636) = 0.945744 \approx 0.9457. $$

To interpolate for x = 1.02, the work would be centered at x = 1.10.
With s = -0.8, there would follow

$$ f(1.02) \approx (0.72)(0.8415) + (0.36)(0.8912) + (-0.08)(0.9320) = 0.852152 \approx 0.8522. $$

The given data correspond to rounded values of f(x) = sin x, and the
results correspond to the tabulated five-place values sin 1.24 ≈ 0.94578
and sin 1.02 ≈ 0.85211.
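For m = 1 the coefficient functions reduce to $L_{-1}(s) = s(s - 1)/2$, $L_0(s) = 1 - s^2$, and $L_1(s) = s(s + 1)/2$, so that the entries of Table 3.1 and the two interpolations above can be reproduced in a few lines. The function names in this sketch are hypothetical.

```python
def three_point_coefficients(s):
    """Return L_{-1}(s), L_0(s), L_1(s) for equally spaced abscissas (m = 1)."""
    return s * (s - 1) / 2, 1 - s * s, s * (s + 1) / 2

def interpolate_three_point(f_minus, f_zero, f_plus, s):
    """Three-point Lagrangian interpolation centered at the ordinate f_zero."""
    lm, l0, lp = three_point_coefficients(s)
    return lm * f_minus + l0 * f_zero + lp * f_plus

# f(1.24), centered at x = 1.20 with s = 0.4
print(interpolate_three_point(0.8912, 0.9320, 0.9636, 0.4))
# f(1.02), centered at x = 1.10 with s = -0.8
print(interpolate_three_point(0.8415, 0.8912, 0.9320, -0.8))
```

Evaluating the coefficients at -s instead of s interchanges the first and last of them, which is the symmetry exploited by the labels at the foot of the table.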

Extensive tables of Lagrange coefficient functions, and of certain of
their derivatives, may be found in the literature (see Appendix B).


3.5. Newton-Cotes Integration Formulas. In order to obtain formulas
for the approximate evaluation of an integral of the form $\int_a^b f(x)\,dx$,
where a and b are finite, we may first introduce the change of variables

$$ x = a + \frac{b - a}{n} s, $$   (3.5.1)

where n is an integer, to obtain the relation

$$ \int_a^b f(x)\,dx = \frac{b - a}{n} \int_0^n F(s)\,ds, $$   (3.5.2)

where

$$ F(s) = f\!\left(a + \frac{b - a}{n} s\right). $$   (3.5.3)

If now it is assumed that f(x) can be approximated over (a,b) by the
polynomial which agrees with it at, say, n + 1 equally spaced points in
(a,b), we may obtain the approximate formula

$$ \int_0^n F(s)\,ds \approx \sum_{k=0}^{n} C_k^{(n)} F(k), $$   (3.5.4)

where

$$ C_k^{(n)} = \int_0^n \frac{s(s - 1) \cdots (s - k + 1)(s - k - 1) \cdots (s - n)}{k(k - 1) \cdots (1)(-1) \cdots (k - n)}\,ds. $$   (3.5.5)
In accordance with (3.3.5), the error term omitted on the right in
(3.5.4) can be expressed in the form

$$ \bar{E}_n = \frac{1}{(n + 1)!} \int_0^n s(s - 1) \cdots (s - n) F^{(n+1)}(\xi_1)\,ds, $$   (3.5.6)

where $0 < \xi_1 < n$. Since the coefficient of $F^{(n+1)}$ is not of constant sign
in (0,n), the second law of the mean cannot be applied directly.
However, it is possible to prove (see Steffensen [18]) that, when n is odd, the
error can be expressed in the form which would be obtained if this
procedure were valid,

$$ \bar{E}_n = \frac{F^{(n+1)}(\xi_2)}{(n + 1)!} \int_0^n s(s - 1) \cdots (s - n)\,ds \qquad (n \text{ odd}), $$   (3.5.7a)

whereas, when n is even, the error can be expressed in the form

$$ \bar{E}_n = \frac{F^{(n+2)}(\xi_2)}{(n + 2)!} \int_0^n \left(s - \frac{n}{2}\right) s(s - 1) \cdots (s - n)\,ds \qquad (n \text{ even}), $$   (3.5.7b)

where $0 < \xi_2 < n$ in each case.

If, as before, we write h = (b - a)/n and $x_i = a + hi$, the result
established can be put in the more explicit form

$$ \int_a^b f(x)\,dx \approx h \sum_{k=0}^{n} C_k^{(n)} f(x_k), $$   (3.5.8)

where $C_k^{(n)}$ is defined by (3.5.5). By noticing that

$$ \frac{d^r}{ds^r} F(s) = h^r \frac{d^r}{dx^r} f(x) $$

and that, from (3.5.2), the error in (3.5.8) is $h\bar{E}_n$, we obtain also the
expressions

$$ E_n = \frac{h^{n+2} f^{(n+1)}(\xi)}{(n + 1)!} \int_0^n s(s - 1) \cdots (s - n)\,ds \qquad (n \text{ odd}) $$   (3.5.9a)

and

$$ E_n = \frac{h^{n+3} f^{(n+2)}(\xi)}{(n + 2)!} \int_0^n \left(s - \frac{n}{2}\right) s(s - 1) \cdots (s - n)\,ds \qquad (n \text{ even}), $$   (3.5.9b)

where $x_0 < \xi < x_n$ in each case.

In illustration, we consider the case n = 2. Here there follows, from
(3.5.5),

$$ C_0^{(2)} = \int_0^2 \frac{(s - 1)(s - 2)}{(-1)(-2)}\,ds = \frac{1}{3}, \qquad C_1^{(2)} = \int_0^2 \frac{s(s - 2)}{(1)(-1)}\,ds = \frac{4}{3}, \qquad C_2^{(2)} = \int_0^2 \frac{s(s - 1)}{(2)(1)}\,ds = \frac{1}{3}, $$

and (3.5.9b) gives

$$ E_2 = \frac{h^5 f^{iv}(\xi)}{24} \int_0^2 s(s - 1)^2 (s - 2)\,ds = -\frac{h^5 f^{iv}(\xi)}{90}. $$

The corresponding formula (3.5.8), with the error term, then takes the
form

$$ \int_{x_0}^{x_2} f(x)\,dx = \frac{h}{3} (f_0 + 4f_1 + f_2) - \frac{h^5}{90} f^{iv}(\xi) \qquad (x_0 < \xi < x_2). $$

This is the celebrated formula of Simpson’s rule.
In a similar way, the following formulas may be obtained:

$$ \int_{x_0}^{x_1} f(x)\,dx = \frac{h}{2} (f_0 + f_1) - \frac{h^3}{12} f''(\xi), $$   (3.5.10)

$$ \int_{x_0}^{x_2} f(x)\,dx = \frac{h}{3} (f_0 + 4f_1 + f_2) - \frac{h^5}{90} f^{iv}(\xi), $$   (3.5.11)

$$ \int_{x_0}^{x_3} f(x)\,dx = \frac{3h}{8} (f_0 + 3f_1 + 3f_2 + f_3) - \frac{3h^5}{80} f^{iv}(\xi), $$   (3.5.12)

$$ \int_{x_0}^{x_4} f(x)\,dx = \frac{2h}{45} (7f_0 + 32f_1 + 12f_2 + 32f_3 + 7f_4) - \frac{8h^7}{945} f^{vi}(\xi), $$   (3.5.13)

$$ \int_{x_0}^{x_5} f(x)\,dx = \frac{5h}{288} (19f_0 + 75f_1 + 50f_2 + 50f_3 + 75f_4 + 19f_5) - \frac{275h^7}{12096} f^{vi}(\xi), $$   (3.5.14)

$$ \int_{x_0}^{x_6} f(x)\,dx = \frac{h}{140} (41f_0 + 216f_1 + 27f_2 + 272f_3 + 27f_4 + 216f_5 + 41f_6) - \frac{9h^9}{1400} f^{viii}(\xi), $$   (3.5.15)

$$ \int_{x_0}^{x_7} f(x)\,dx = \frac{7h}{17280} (751f_0 + 3577f_1 + 1323f_2 + 2989f_3 + 2989f_4 + 1323f_5 + 3577f_6 + 751f_7) - \frac{8183h^9}{518400} f^{viii}(\xi), $$   (3.5.16)

$$ \int_{x_0}^{x_8} f(x)\,dx = \frac{4h}{14175} (989f_0 + 5888f_1 - 928f_2 + 10496f_3 - 4540f_4 + 10496f_5 - 928f_6 + 5888f_7 + 989f_8) - \frac{2368h^{11}}{467775} f^{x}(\xi). $$   (3.5.17)


An inspection of the error terms reveals that a formula involving an
odd number 2m + 1 of points would yield exact results if f(x) were a
polynomial of degree 2m + 1 or less, whereas a formula involving an even
number n + 1 = 2m of points would be exact only if f(x) were a polynomial
of degree n or less. Thus the two formulas involving 2m - 1 and 2m
ordinates have the same order of accuracy, so that generally no great
advantage is gained by advancing from a formula involving an odd
number of ordinates to one involving one more ordinate. In particular, the
error in Simpson’s rule (3.5.11) is given by $-h^5 f^{iv}(\xi_1)/90$, and that in
Newton’s rule (3.5.12) is given by $-3h^5 f^{iv}(\xi_2)/80$, where both $\xi_1$ and $\xi_2$ are
in (a,b). In comparing these errors, when both formulas are applied to
the evaluation of the same integral, we must notice that h = (b - a)/2 in
the former case, whereas h = (b - a)/3 in the latter. Hence the coefficient
of $-(b - a)^5 f^{iv}$ is 1/2880 in Simpson’s rule and 1/6480 in Newton’s rule.
Thus the latter (which involves one extra ordinate) may be expected to
be only slightly more accurate than the former, on the average. Clearly,
the advantage may be shifted in either direction if $f^{iv}(x)$ varies strongly
over (a,b), so that $f^{iv}(\xi_1)$ and $f^{iv}(\xi_2)$ may differ appreciably, or if $f^{iv}(x)$
fails to exist or is discontinuous somewhere in (a,b), so that the error
formulas are invalid.
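The comparison can be tried on a specific integral. The sketch below applies both rules to the integral of $e^x$ over (0,1), an example chosen here because its fourth derivative lies between 1 and e, so that the two errors must lie between 1/2880 and e/2880 and between 1/6480 and e/6480, respectively. The function names are hypothetical.

```python
import math

def simpson(f, a, b):
    """Simpson's rule (3.5.11) over (a, b), with h = (b - a)/2."""
    h = (b - a) / 2
    return h / 3 * (f(a) + 4 * f(a + h) + f(b))

def three_eighths(f, a, b):
    """Newton's rule (3.5.12) over (a, b), with h = (b - a)/3."""
    h = (b - a) / 3
    return 3 * h / 8 * (f(a) + 3 * f(a + h) + 3 * f(a + 2 * h) + f(b))

exact = math.e - 1.0                     # integral of exp(x) over (0, 1)
# Approximation minus true value, i.e. the negative of the error term in the text.
err_simpson = simpson(math.exp, 0.0, 1.0) - exact
err_newton = three_eighths(math.exp, 0.0, 1.0) - exact
print(err_simpson, err_newton, err_simpson / err_newton)
```

The ratio of the two discrepancies is close to 6480/2880 = 2.25 here because the fourth derivative varies only moderately over the interval.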

Another useful set of integration formulas is obtained by dividing the 
interval (a,b), a8 before, into n equal parts by inserting n — 1 equally 
spaced interior abscissas, then approximating f(x) by the polynomial of 
degree n — 2 which coincides with f(x) at the n — 1 interior points, and 
approximating the relevant integral by integrating the resultant poly- 
nomial over (a,b). These formulas thus do not involve the ordinates at 
the ends of the interval and are said to be of open type, whereas those 
previously considered are said to be of closed type. ‘The first few such 
formulas (n = 3, . - - » 6) may be expressed as follows: 


e fw) dx = se (fr + fa) + ar 7", (3.5.18) 
* sa) ae = F Oh — fa + 0) + MI (8), 3.5.19) 
i f(a) de = De fs + fa + Sa + Uf + oh Fr, (3.5.20) 


E f(x) dx = as (11fs — W4fe + 26f2 — 14fa + Lfs) + on. 


The formulas of the type considered in this section are generally known 
as the Newton-Cotes formulas. Those of open type are principally of use 


in the numerical integration of differential equations. 


Since all the integral formulas of the type considered in this chapter 


must, in particular, be exact if f(x) is a constant, it follows that the sum 


the weighting coefficients in any formula must equal the length of the interval 


Thus, for example, that sum in (3.5.13) is eph 90 = 4h =b—a. 
In place of using a single polynomial to approximate f(x) over the
complete range (a,b), it is clearly possible to divide (a,b) into subranges and
approximate f(x) by a different polynomial over each subrange. Thus,
for example, by applying the two-point formula (3.5.10) to n successive
subranges of length h, one obtains the so-called trapezoidal rule:


$$\int_a^b f(x)\,dx = h\left(\tfrac{1}{2}f_0 + f_1 + f_2 + \cdots + f_{n-2} + f_{n-1} + \tfrac{1}{2}f_n\right) - \frac{nh^3}{12} f''(\xi), \tag{3.5.22}$$


where $f_0 = f(a)$, $f_k = f(a + kh)$, and $f_n = f(b)$, and where now $\xi$ is
somewhere in (a,b). This formula corresponds to replacing the graph of f(x)
by the result of joining the ends of adjacent ordinates by line segments
and is of remarkable simplicity. Whereas it is not of high accuracy, one
may notice that, since h = (b - a)/n, the magnitude of the error can be
expressed in the form

$$|E| = \frac{(b-a)^3}{12n^2}\,|f''(\xi)|.$$

Hence, if only f''(x) is continuous in (a,b), the error
will indeed tend to zero like $1/n^2$ as $n \to \infty$.

As will be seen, the accuracy afforded by a k-point Newton-Cotes
formula does not necessarily increase as k increases, and, in fact, the situation
may become worse and worse after a certain stage even though f(x)
possesses continuous derivatives of all orders for all real values of x and
even though no round-off errors be introduced. In such cases, unless
the desired accuracy is attained before this stage is attained, the use of a
composite rule such as the trapezoidal rule (or the parabolic rule, next to
be considered) is essential, as well as convenient.
Another advantage of the trapezoidal rule consists in the fact that the
weighting coefficients are nearly equal to each other. For it is easily
seen that, if n + 1 ordinates are each liable to random errors of
observation (or round-off), the RMS error in a linear combination of those
ordinates, for which the sum of the constants of combination is fixed
(here equal to b - a), is least when the constants of combination are equal.
Newton-Cotes formulas of the open type are particularly objectionable
from this point of view, since, for n = 4 and n ≥ 6, their coefficients
actually fluctuate in sign. Similar sign fluctuations also occur in
formulas of closed type when n = 8 and n ≥ 10.

By dividing the interval (a,b) into n/2 subranges of length 2h, where
n is an even integer, and applying Simpson's rule to each subrange [that
is, by approximating the graph of f(x) by a parabola in each subrange],
the so-called parabolic rule is obtained in the form


$$\int_a^b f(x)\,dx = \frac{h}{3}\left(f_0 + 4f_1 + 2f_2 + 4f_3 + \cdots + 4f_{n-3} + 2f_{n-2} + 4f_{n-1} + f_n\right) - \frac{nh^5}{180} f^{iv}(\xi), \tag{3.5.23}$$

where again $f_0 = f(a)$, $f_k = f(a + kh)$, and $f_n = f(b)$. Here, if $f^{iv}(x)$ is
continuous in (a,b), the error associated with the use of n + 1 ordinates
tends to zero like $1/n^4$ as $n \to \infty$. Thus the parabolic rule is usually
more accurate than the trapezoidal rule, when n is sufficiently large, unless 
the function f displays an unusual behavior in (a,b). Since also its 
weighting coefficients are simple and do not fluctuate unduly in magni- 
tude, it is perhaps the most widely used of all formulas for numerical 
integration. However, it can be used only when (a,b) is divided into an 
even number of intervals of length h, and its use may require a prohib- 
itively large number of ordinates if a high degree of accuracy is required. 
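
The two composite rules (3.5.22) and (3.5.23) translate directly into a few lines of code. The following is a minimal sketch in Python, not part of the original text; the function names `trapezoidal` and `parabolic` are illustrative, and the test integrand 1/(1 + x) on (0,1) is chosen only as a convenient case with a known value, log 2.

```python
import math


def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule (3.5.22) with n subranges of length h."""
    h = (b - a) / n
    interior = sum(f(a + k * h) for k in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))


def parabolic(f, a, b, n):
    """Composite Simpson (parabolic) rule (3.5.23); n must be even."""
    if n % 2:
        raise ValueError("the parabolic rule requires an even number of intervals")
    h = (b - a) / n
    odd = sum(f(a + k * h) for k in range(1, n, 2))    # ordinates with weight 4
    even = sum(f(a + k * h) for k in range(2, n, 2))   # ordinates with weight 2
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))


f = lambda x: 1.0 / (1.0 + x)
exact = math.log(2.0)
for n in (2, 4, 8, 16):
    # Doubling n should reduce the first error by about 4 and the second by about 16.
    print(n, trapezoidal(f, 0.0, 1.0, n) - exact, parabolic(f, 0.0, 1.0, n) - exact)
```

The printed differences illustrate the $1/n^2$ and $1/n^4$ behavior of the two truncation errors.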

Other integration formulas are considered in Chaps. 5 and 8. 

3.6. Use of Integration Formulas. In order to illustrate the preceding
formulas in a simple case, we consider first the numerical evaluation of
the integral

$$\int_0^1 \frac{dx}{1+x} = \log 2 = 0.69314718\cdots.$$

With f(x) = 1/(1 + x), there follows also

$$f^{(k)}(x) = \frac{(-1)^k\,k!}{(1+x)^{k+1}},$$

and hence

$$2^{-k-1}\,k! < (-1)^k f^{(k)}(x) < k! \qquad (0 < x < 1).$$

Thus, for example, if use were to be made of the five-point formula
(3.5.13), with h = 0.25, the upper and lower bounds

$$0.000002 < -E < 0.0004$$

would be available with regard to the truncation error. Since $f^{(k)}(x)$ is
positive in (0,1) when k is even, it follows that each error term will be
negative.
The following table of upper bounds on the magnitude of the possible
error relevant to the Newton-Cotes (N-C), trapezoidal (T), and parabolic
(P) rules, in the present case, is easily determined:

    Ordinates       N-C            T             P
        2        2 × 10^-1     2 × 10^-1         -
        3        9 × 10^-3     5 × 10^-2     9 × 10^-3
        4        4 × 10^-3     2 × 10^-2         -
        5        4 × 10^-4     2 × 10^-2     6 × 10^-4
        6        3 × 10^-4     7 × 10^-3         -
        7        3 × 10^-5     5 × 10^-3     2 × 10^-4
        8        2 × 10^-5     4 × 10^-3         -
        9        2 × 10^-6     3 × 10^-3     4 × 10^-5


In particular, it can be predicted that the error involved in the use of
the parabolic rule with n + 1 ordinates (where n is even) will be between
$1/(240n^4)$ and $2/(15n^4)$ in magnitude, and that involved in the use of the
trapezoidal rule will be between $1/(48n^2)$ and $1/(6n^2)$.
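
These predicted ranges can be compared with the errors actually committed. The following is a minimal sketch in Python, not part of the original text; it assumes the integrand f(x) = 1/(1 + x) on (0,1) of the present example, and the helper names are illustrative. Since each error term is negative, the computed value exceeds the true value, and the excess is what is tested against the bounds.

```python
import math


def f(x):
    return 1.0 / (1.0 + x)


def trapezoidal(n):
    """Composite trapezoidal rule on (0,1) with n subranges."""
    h = 1.0 / n
    return h * (0.5 * f(0.0) + sum(f(k * h) for k in range(1, n)) + 0.5 * f(1.0))


def parabolic(n):
    """Composite parabolic rule on (0,1) with n (even) subranges."""
    h = 1.0 / n
    odd = sum(f(k * h) for k in range(1, n, 2))
    even = sum(f(k * h) for k in range(2, n, 2))
    return h / 3 * (f(0.0) + 4 * odd + 2 * even + f(1.0))


exact = math.log(2.0)
for n in (2, 4, 6, 8):
    t_excess = trapezoidal(n) - exact   # equals -E for the trapezoidal rule
    p_excess = parabolic(n) - exact     # equals -E for the parabolic rule
    assert 1 / (48 * n**2) < t_excess < 1 / (6 * n**2)
    assert 1 / (240 * n**4) < p_excess < 2 / (15 * n**4)
    print(n + 1, t_excess, p_excess)
```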

The preliminary upper bounds may be quite conservative, but it is 
difficult to obtain more precise ones. 

In addition to the truncation errors, one must consider the effect of
round-off in the values of the ordinates used in the calculation. If each
ordinate is rounded correctly to r decimal places, so that the maximum
error in each ordinate is not greater than $5 \times 10^{-r-1}$, the maximum
corresponding error in the final calculation is therefore not greater than
$5 \times 10^{-r-1}$ times the sum of the absolute values of the relevant weighting
coefficients. If those coefficients are all positive, this last sum must equal
the length of the range of integration (here unity). Thus, in the present
case, if all weighting coefficients are positive, the error in the final result,
due to inaccuracies in the original data, cannot exceed the maximum of
those inaccuracies. This situation prevails in all the formulas considered
in the preceding tabulation except the Newton-Cotes nine-point formula,
in which a magnification factor of 6857/4725 ≈ 1.5 would be involved.
Whereas a considerable amount of cancellation in the round-off errors
would be expected (particularly if the weighting coefficients are nearly
equal), it cannot be guaranteed in any particular case.

Suppose that the ordinates used in the present calculation are to be
rounded to r decimal places, and that the final result is to be in error by
less than one unit in the rth decimal place. If the parabolic rule is to be
used, with n + 1 ordinates, an even integer n must then be determined
such that

$$\frac{2}{15n^4} < 5 \times 10^{-r-1} \qquad\text{or}\qquad n > 0.72 \times 10^{r/4}.$$

The total error, due to truncation and initial round-off, then could not
exceed $10^{-r}$, under the assumption that no intermediate round-offs are
effected. For r = 4, this condition gives n > 7.2, so that nine ordinates
would be required; for r = 5, thirteen ordinates would suffice. If the
trapezoidal rule were used, the need for about 58 ordinates would be
predicted for four-place accuracy. Reference to the preceding table
shows that a Newton-Cotes formula using nine ordinates would lead to a
result in error by less than $2 \times 10^{-6}$ due to truncation. If the ordinates
were rounded to five places, the effect of that round-off here could be as
large as $8 \times 10^{-6}$. Thus the final error could not exceed $10^{-5}$.
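
The round-off magnification factor of a closed Newton-Cotes formula is the ratio of the sum of the absolute values of its weights to the sum of the weights. The following is a minimal sketch in Python, not part of the original text; the function names are illustrative, and the weights are obtained by integrating the Lagrangian coefficient functions exactly in rational arithmetic rather than being copied from a table.

```python
from fractions import Fraction


def closed_newton_cotes_weights(n):
    """Exact weights, in units of h, of the closed rule on the nodes s = 0, 1, ..., n.

    Each weight is the integral over (0, n) of the Lagrangian coefficient l_k(s).
    """
    weights = []
    for k in range(n + 1):
        coeffs = [Fraction(1)]                 # polynomial coefficients, lowest power first
        for j in range(n + 1):
            if j == k:
                continue
            scale = Fraction(1, k - j)         # multiply the polynomial by (s - j)/(k - j)
            shifted = [Fraction(0)] + [c * scale for c in coeffs]
            lowered = [c * (-j) * scale for c in coeffs] + [Fraction(0)]
            coeffs = [p + q for p, q in zip(shifted, lowered)]
        # Integrate term by term from 0 to n.
        weights.append(sum(c * Fraction(n) ** (i + 1) / (i + 1)
                           for i, c in enumerate(coeffs)))
    return weights


def magnification(n):
    """Sum of |weights| divided by sum of weights for the (n+1)-point closed rule."""
    w = closed_newton_cotes_weights(n)
    return sum(abs(x) for x in w) / sum(w)


for n in range(1, 9):
    print(n + 1, "ordinates:", float(magnification(n)))
```

For two through eight ordinates the factor is unity, because all weights are positive; for nine ordinates it is about 1.45, which with five-place ordinates (maximum error $5 \times 10^{-6}$) gives a bound of roughly $7.3 \times 10^{-6}$.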


Actual calculation, with the ordinates rounded to five places, shows
that the error associated with Simpson's rule (three ordinates) is smaller
than $2 \times 10^{-3}$, with the five-ordinate Newton-Cotes formula less than
$3 \times 10^{-5}$, with the five-ordinate trapezoidal rule less than $4 \times 10^{-3}$, and
with the five-ordinate parabolic rule less than $2 \times 10^{-4}$. The fact that some
of the error predictions were quite conservative is a consequence of the
variation of the higher derivatives of f(x) over (0,1). Thus, for example,
the error estimate for the five-ordinate Newton-Cotes formula assigned
the maximum value 720 to $f^{vi}(x)$ in (0,1), whereas all values from 45/8 to
720 are admissible. A value of about 56 would have given the proper
estimate.

In those cases where f(x) is given empirically or, more generally, in 
such a form that information with regard to bounds on higher derivatives 
of f(x) is not readily accessible, less dependable error estimates may be 
based on the calculation of one or more divided differences of order equal 
to that of the derivative involved in the error estimate. 

Another method of estimating the error is of frequent usefulness, and
it may be illustrated in the case of the parabolic rule. Suppose that two
calculations are made, the first with $n_1 + 1$ ordinates and the second with
$n_2 + 1$ ordinates, where $n_1$ and $n_2$ are even integers. Let the resultant
approximations be designated by $I_1$ and $I_2$, and suppose that the true
value is $I$. Then, if only truncation errors are considered, there follows
from (3.5.23)

$$I = I_1 - \frac{(b-a)^5}{180\,n_1^4}\, f^{iv}(\xi_1), \qquad I = I_2 - \frac{(b-a)^5}{180\,n_2^4}\, f^{iv}(\xi_2),$$

where $\xi_1$ and $\xi_2$ are in (a,b). If it is assumed that $f^{iv}(\xi_1) \approx f^{iv}(\xi_2)$, the
unknown fourth derivative can be eliminated to give the approximate
extrapolation formula†

$$I \approx \frac{n_2^4 I_2 - n_1^4 I_1}{n_2^4 - n_1^4} = I_2 + \frac{n_1^4}{n_2^4 - n_1^4}\,(I_2 - I_1). \tag{3.6.1}$$

In particular, if $n_2 = 2n_1$, there follows

$$I \approx I_2 + \tfrac{1}{15}(I_2 - I_1) \qquad (n_2 = 2n_1). \tag{3.6.2}$$

This approximation will certainly be valid if $f^{iv}(x)$ does not vary rapidly,
and does not change sign, over (a,b), and generally may be used with
some confidence in any case, if the correction to be added to $I_2$ is small
relative to $I_2$ itself and if successive approximations appear to be
approaching a limit from one side, without oscillating about it.

In the case of the preceding example, the approximation I ≈ 0.694444
was obtained with n = 2 and the approximation I ≈ 0.693254 was
obtained with n = 4. Use of the extrapolation formula (3.6.2) gives

$$I \approx 0.693254 - 0.000079 = 0.693175.$$

The error in $I_2$ would be estimated as about -0.00008, and the value
0.693175 would be presumed correct within perhaps one or two units in
the fourth place, as is indeed the case. When oscillation of the sequence
of successive approximations is present, this procedure may be quite
undependable, as will be illustrated in the next section.

† A procedure of this general type, in which two calculations are made, with errors
of the respective forms $\phi(n_1)/n_1^k$ and $\phi(n_2)/n_2^k$, where $\phi(n)$ is an incompletely known
function of n, and in which an extrapolation to n = ∞ is made under the assumption
that $\phi(n)$ is nearly independent of n, is often known as Richardson's deferred approach
to the limit. See Richardson and Gaunt [105].
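
The extrapolation (3.6.1) is easily mechanized. The following is a minimal sketch in Python, not part of the original text; the names `parabolic` and `richardson` are illustrative, and the parameter `order` (here 4, the power of 1/n in the parabolic-rule error) is an assumption that would have to be changed for other rules.

```python
def parabolic(f, a, b, n):
    """Composite parabolic (Simpson) rule with n (even) subranges."""
    h = (b - a) / n
    odd = sum(f(a + k * h) for k in range(1, n, 2))
    even = sum(f(a + k * h) for k in range(2, n, 2))
    return h / 3 * (f(a) + 4 * odd + 2 * even + f(b))


def richardson(i1, n1, i2, n2, order=4):
    """Formula (3.6.1): eliminate an error term assumed proportional to 1/n**order."""
    ratio = n1 ** order / (n2 ** order - n1 ** order)
    return i2 + ratio * (i2 - i1)


f = lambda x: 1.0 / (1.0 + x)
i1 = parabolic(f, 0.0, 1.0, 2)      # three ordinates
i2 = parabolic(f, 0.0, 1.0, 4)      # five ordinates
print(i1, i2, richardson(i1, 2, i2, 4))
```

With $n_2 = 2n_1$ the ratio reduces to 1/15, in agreement with (3.6.2). The extrapolation is trustworthy only under the conditions stated in the text (slowly varying fourth derivative, one-sided approach to the limit).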
3.7. Asymptotic Behavior of Newton-Cotes Formulas. In this section
we consider certain results which relate to the choice between the use of
a single Newton-Cotes formula over an entire range of n + 1 ordinates and
the use of lower-order formulas over successive subdivisions, when n is
large.

The problem consists in examining the behavior of the error term,

$$E_n = h^{n+2} f^{(n+1)}(\xi) \int_0^n \frac{s(s-1)\cdots(s-n)}{(n+1)!}\,ds \qquad (n \text{ odd}), \tag{3.7.1}$$

$$E_n = h^{n+3} f^{(n+2)}(\xi) \int_0^n \frac{\left(s - \frac{n}{2}\right) s(s-1)\cdots(s-n)}{(n+2)!}\,ds \qquad (n \text{ even}), \tag{3.7.2}$$

where n + 1 is the number of ordinates and $\xi$ is somewhere in the relevant
range of integration, say (a,b).

As is suggested by the error terms given explicitly in §3.5, the numerical
factor represented by the integral in (3.7.1) or (3.7.2) decreases slowly as
n increases. It can be shown (see Prob. 37) that, when n is large, the
factor in (3.7.1) is approximated by $-2/[n(\log n)^2]$, and that in (3.7.2) by
one-half that quantity. Thus, in either case, the numerical factor
ultimately tends to zero somewhat more rapidly than 1/n but less rapidly
than $1/n^2$.

On the other hand, if f(x) can be represented by a Taylor series in the
neighborhood of the point x = a, and if the radius of convergence of that
series is R, then $|f^{(k)}(a)|$ generally is of the order of $k!/R^k$ when k is large.
Thus, if f(x) can be so represented at each point of (a,b), there follows†

$$|E_n| = O\left[\frac{n!}{(\log n)^2}\left(\frac{h}{R}\right)^{n+2}\right] \qquad (n \text{ odd}), \tag{3.7.3}$$

and

$$|E_n| = O\left[\frac{(n+1)!}{(\log n)^2}\left(\frac{h}{R}\right)^{n+3}\right] \qquad (n \text{ even}), \tag{3.7.4}$$

where R is the radius of convergence of the Taylor-series expansion of
f(x) about the relevant point $\xi$ in (a,b).

† The notation f(n) = O[g(n)] is used to indicate that the ratio f(n)/g(n) tends
to a finite limit as n → ∞. If that ratio tends to unity, the notation f(n) ~ g(n) is
conventional.

If the spacing h were held fixed and n were increased, the expressions in
brackets would increase without limit, so long as R remained finite, and
hence the error associated with the numerical integration would also
ultimately tend to increase unboundedly in magnitude. This limiting
process, however, corresponds to allowing the interval (a,b) to increase
in length without limit.

If, instead, the interval (a,b) is fixed and is divided into n equal parts,
there follows h = (b - a)/n. The result of substituting this relation into
(3.7.3) and (3.7.4) can be simplified if use is made of the Stirling
approximation to the factorial,

$$n! \sim \sqrt{2\pi n}\; n^n e^{-n} \qquad (n \to \infty), \tag{3.7.5}$$

and a simple calculation leads to the desired result in the form

$$|E_n| = O\left[\frac{1}{n^{3/2}(\log n)^2}\left(\frac{b-a}{eR}\right)^{n}\right]. \tag{3.7.6}$$

Whereas the factor $n^{3/2}(\log n)^2$ in the denominator acts in favor of
convergence, nevertheless, if it happens that

$$R < \frac{1}{e}\,(b - a), \tag{3.7.7}$$

the factor $[(b-a)/(eR)]^n$ will dominate the denominator and the error
will ultimately tend to increase unboundedly as n increases, although it
may decrease with increasing n when n is sufficiently small.

It will be recalled that, when f(z) is an analytic function of the complex
variable z, the radius of convergence of the Taylor expansion about
x = a is the distance, in the complex plane, between the point x = a and
the nearest singularity of f(z).

In the example of the preceding section, the only singularity of
f(x) = 1/(1 + x) is at x = -1, and the smallest value of R for points in
(0,1) is accordingly unity, corresponding to x = 0. Thus the ratio
(b - a)/(eR) here cannot exceed 1/e, and the difficulty does not arise.
The fact that relatively many ordinates were needed, in that case, to
supply a given accuracy is, however, a consequence of the relative
nearness of the singularity.

Singularities at complex points are, of course, just as troublesome as
those which occur for real values of x. In order to illustrate this
fact, we consider the integral

$$\int_{-4}^{4} \frac{dx}{1+x^2} = 2\tan^{-1} 4 \approx 2.6516. \tag{3.7.8}$$

Here, although $f(x) = 1/(1 + x^2)$ is perfectly well behaved when x is
real, it possesses singularities at the complex points x = ±i. Thus
here the minimum value of R is unity, corresponding to the central point
x = 0, and (b - a)/(eR) can be as large as 8/e ≈ 2.94, so that an
ultimate increase in error with increasing n is to be expected if the successive
values of $\xi$ involved in (3.7.1) and (3.7.2) are near the center of the
interval, as might be anticipated from the symmetry of the integrand.
Direct calculation indicates that this undesirable situation does indeed
exist. The results of computations involving n + 1 = 3, 5, 7, 9, and 11
ordinates, and using Newton-Cotes formulas over the entire range in each
case, are compared in the following table with the results afforded by the
same ordinates with the use of the parabolic rule and of the trapezoidal
rule:

    n + 1      N-C        P         T
      3       5.490     5.490     4.235
      5       2.278     2.478     2.918
      7       3.329     2.908     2.701
      9       1.941     2.573     2.659
     11       3.596     2.695     2.6511


It is seen that the best of the Newton-Cotes approximations
corresponds to the use of only five ordinates, and that the errors associated
with successive formulas of higher order oscillate with increasing
amplitude about the true value.

The sequence of approximations afforded by the parabolic rule
displays damped oscillations but is, of course, convergent. On the other
hand, the trapezoidal-rule sequence is converging toward the true value
at a rate which has not yet been exceeded by that of the parabolic-rule
sequence, although the incorporation of additional ordinates would
eventually reverse the advantage.

It is of importance to notice that the use of the extrapolation formula
(3.6.1) is undependable here, because of the oscillation. Thus, whereas
it gives a good prediction with $n_1 = 4$ and $n_2 = 8$, the prediction based
on $n_1 = 6$ and $n_2 = 8$ is worse than either of the approximations upon
which it is based.
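
The tabulated values for the integral (3.7.8) can be recomputed as follows. This is a minimal sketch in Python, not part of the original text; the function names are illustrative, and the closed Newton-Cotes weights are generated by exact integration of the Lagrangian coefficient functions rather than taken from the formulas of §3.5.

```python
from fractions import Fraction


def closed_newton_cotes_weights(n):
    """Exact weights, in units of h, of the closed rule on the nodes s = 0, 1, ..., n."""
    weights = []
    for k in range(n + 1):
        coeffs = [Fraction(1)]                 # coefficients of l_k(s), lowest power first
        for j in range(n + 1):
            if j == k:
                continue
            scale = Fraction(1, k - j)         # multiply by (s - j)/(k - j)
            shifted = [Fraction(0)] + [c * scale for c in coeffs]
            lowered = [c * (-j) * scale for c in coeffs] + [Fraction(0)]
            coeffs = [p + q for p, q in zip(shifted, lowered)]
        weights.append(sum(c * Fraction(n) ** (i + 1) / (i + 1)
                           for i, c in enumerate(coeffs)))
    return weights


A, B = -4.0, 4.0


def f(x):
    return 1.0 / (1.0 + x * x)


def newton_cotes(n):
    """Single closed Newton-Cotes formula over the whole range, n + 1 ordinates."""
    h = (B - A) / n
    w = closed_newton_cotes_weights(n)
    return h * sum(float(wk) * f(A + k * h) for k, wk in enumerate(w))


def trapezoidal(n):
    h = (B - A) / n
    return h * (0.5 * f(A) + sum(f(A + k * h) for k in range(1, n)) + 0.5 * f(B))


def parabolic(n):
    h = (B - A) / n
    odd = sum(f(A + k * h) for k in range(1, n, 2))
    even = sum(f(A + k * h) for k in range(2, n, 2))
    return h / 3 * (f(A) + 4 * odd + 2 * even + f(B))


for n in (2, 4, 6, 8, 10):
    print(n + 1, round(newton_cotes(n), 3), round(parabolic(n), 3), round(trapezoidal(n), 4))
```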

The preceding example is not intended to generally discredit the
Newton-Cotes formulas which use many ordinates, but to serve as a
warning that there exist many nonpathological situations in which their
use is not appropriate. Such situations generally can be predicted in
advance when f(x) is given analytically. In particular, if the Taylor-
series expansions of f(x) converge everywhere [as for $e^{-x^2}$, $J_0(x)$, and so
forth], the difficulty noted does not arise. However, if only tabular
values of f(x) are available, such a prediction cannot be made, and the
use of a more dependable formula, such as the parabolic rule (which
yields quartic convergence, if sufficient data are available, if only $f^{iv}$ is
continuous) or the trapezoidal rule (which requires only continuity of
f'' for quadratic convergence), may be desirable. Also, if the data are
empirical, and of doubtful reliability, the trapezoidal rule is often to be
preferred, although here it is generally still better to smooth the data
before integration (see §7.13).

Finally, it should be noticed that relations such as (3.7.3), (3.7.4), and
(3.7.6) give only orders of magnitude when n is sufficiently large, and are
not of quantitative significance in actual calculation.

3.8. Differentiation Formulas. To conclude this chapter, we list certain
formulas which may be used for numerical differentiation of a
function at tabular points, when the need for such a calculation cannot
be avoided.

By differentiating three- and five-point Lagrangian interpolation
formulas and evaluating the results at tabular points (see §3.4), the following
derivative formulas may be obtained, with a convenient renumbering of
the ordinates.

Three-point Formulas:

$$f'_{-1} = \frac{1}{2h}(-3f_{-1} + 4f_0 - f_1) + \frac{h^2}{3} f'''(\xi), \tag{3.8.1}$$

$$f'_{0} = \frac{1}{2h}(-f_{-1} + f_1) - \frac{h^2}{6} f'''(\xi), \tag{3.8.2}$$

$$f'_{1} = \frac{1}{2h}(f_{-1} - 4f_0 + 3f_1) + \frac{h^2}{3} f'''(\xi). \tag{3.8.3}$$

Five-point Formulas:

$$f'_{-2} = \frac{1}{12h}(-25f_{-2} + 48f_{-1} - 36f_0 + 16f_1 - 3f_2) + \frac{h^4}{5} f^{v}(\xi), \tag{3.8.4}$$

$$f'_{-1} = \frac{1}{12h}(-3f_{-2} - 10f_{-1} + 18f_0 - 6f_1 + f_2) - \frac{h^4}{20} f^{v}(\xi), \tag{3.8.5}$$

$$f'_{0} = \frac{1}{12h}(f_{-2} - 8f_{-1} + 8f_1 - f_2) + \frac{h^4}{30} f^{v}(\xi), \tag{3.8.6}$$

$$f'_{1} = \frac{1}{12h}(-f_{-2} + 6f_{-1} - 18f_0 + 10f_1 + 3f_2) - \frac{h^4}{20} f^{v}(\xi), \tag{3.8.7}$$

$$f'_{2} = \frac{1}{12h}(3f_{-2} - 16f_{-1} + 36f_0 - 48f_1 + 25f_2) + \frac{h^4}{5} f^{v}(\xi). \tag{3.8.8}$$

In each set of formulas, each $\xi$ lies between the extreme values of the
abscissas involved in that formula. It may be noticed that, when the
derivative is required at the central point, the central ordinate is then not
involved in the calculation.

A consideration of these formulas reveals the existence of a new problem.
For example, consider (3.8.2), and suppose that $|f'''(x)| \le M_3$ in the
interval $(x_0 - h, x_0 + h)$. Then, if the given data were exact, the maximum
possible error in the calculation of $f'(x_0)$ would be

$$|E_3|_{\max} = \frac{M_3 h^2}{6}.$$





On the other hand, suppose that each of the ordinates involved could be
in error by ±ε. Then the magnitude of the corresponding error in the
calculation of $f'(x_0)$ could be as large as

$$|R_3|_{\max} = \frac{\epsilon}{h}.$$

Whereas a reduction of the truncation error $E_3$ would generally require a
decrease in h, a small value of h would lead to a large possible round-off
error $R_3$ and, conversely, a reduction in $|R_3|_{\max}$ would generally correspond
to an increase in $|E_3|_{\max}$.
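
The central formulas (3.8.2) and (3.8.6), and the competition between the two error bounds, can be illustrated numerically. The following is a minimal sketch in Python, not part of the original text; the function names are illustrative, and the test function sin x (for which $M_3 = 1$) is an assumption made only for the demonstration.

```python
import math


def d3_central(f_m1, f_p1, h):
    """Three-point central formula (3.8.2), without its error term."""
    return (f_p1 - f_m1) / (2 * h)


def d5_central(f_m2, f_m1, f_p1, f_p2, h):
    """Five-point central formula (3.8.6), without its error term."""
    return (f_m2 - 8 * f_m1 + 8 * f_p1 - f_p2) / (12 * h)


def worst_case_error_3pt(h, eps, m3):
    """Truncation bound M3*h^2/6 plus round-off bound eps/h for (3.8.2)."""
    return m3 * h * h / 6 + eps / h


h = 0.1
d3 = d3_central(math.sin(-h), math.sin(h), h)
d5 = d5_central(math.sin(-2 * h), math.sin(-h), math.sin(h), math.sin(2 * h), h)
print(d3 - 1.0, d5 - 1.0)        # errors in approximating f'(0) = cos 0 = 1

# With ordinates uncertain by eps = 0.01, a very small spacing is worse than a moderate one.
for spacing in (0.001, 0.01, 0.1, 0.4, 1.0):
    print(spacing, worst_case_error_3pt(spacing, 0.01, 1.0))
```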

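A reasonable procedure consists in determining the interval h such that
the predictable upper bounds on the two errors are about equal, if this is
feasible. The optimum value of h and the corresponding maximum total
error $T_3$ are then found to be

$$h_{3,\text{opt}} \approx 1.8\,\epsilon^{1/3} M_3^{-1/3}, \qquad |T_3|_{\max} \approx 1.1\,\epsilon^{2/3} M_3^{1/3}.$$

Corresponding results relevant to (3.8.6), and to similar seven-point
and nine-point formulas, can be obtained as follows:

$$h_{5,\text{opt}} \approx 2.1\,\epsilon^{1/5} M_5^{-1/5}, \qquad |T_5|_{\max} \approx 1.4\,\epsilon^{4/5} M_5^{1/5},$$

$$h_{7,\text{opt}} \approx 2.2\,\epsilon^{1/7} M_7^{-1/7}, \qquad |T_7|_{\max} \approx 1.7\,\epsilon^{6/7} M_7^{1/7},$$

$$h_{9,\text{opt}} \approx 2.2\,\epsilon^{1/9} M_9^{-1/9}, \qquad |T_9|_{\max} \approx 1.9\,\epsilon^{8/9} M_9^{1/9}.$$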


In illustration, suppose that empirical values were to be obtained for
a function which is truly of the form f(x) = sin ωx, and that one of these
formulas were to be used to approximate f'(0) = ω. In this case, the
relevant quantities $M_k^{1/k}$ are each equal to ω. Thus, if, say, the maximum
observational error ε is 0.01, the optimum spacings for the three-, five-,
seven-, and nine-point formulas are found to be about 0.39/ω, 0.84/ω,
1.14/ω, and 1.32/ω, respectively, and the corresponding maximum total
errors in the calculation of f'(0) = ω are found to be about 0.051ω,
0.035ω, 0.033ω, and 0.032ω, respectively. The increase of $h_{\text{opt}}$ with
increasing n, and the fact that an increase in n affords only slight
improvement in accuracy, are both worthy of note.
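
The balancing of the two error bounds can be sketched as follows. This is a minimal illustration in Python, not part of the original text; the function `optimum_spacing` and its parameters are assumed names. For the three-point formula (3.8.2) the truncation bound is $M_3 h^2/6$ and the round-off bound is ε/h; for the five-point formula (3.8.6) they are $M_5 h^4/30$ and $3\epsilon/(2h)$, the latter because the absolute values of the coefficients sum to 18/(12h).

```python
def optimum_spacing(c, rho, p, eps, m):
    """Spacing h at which the truncation bound c*m*h**p equals the round-off bound rho*eps/h.

    Returns (h, total), where total is the sum of the two (equal) bounds.
    """
    h = (rho * eps / (c * m)) ** (1.0 / (p + 1))
    total = 2.0 * rho * eps / h
    return h, total


eps, omega = 0.01, 1.0
# f(x) = sin(omega*x): M_k = omega**k.
h3, t3 = optimum_spacing(1.0 / 6.0, 1.0, 2, eps, omega ** 3)     # three-point, (3.8.2)
h5, t5 = optimum_spacing(1.0 / 30.0, 1.5, 4, eps, omega ** 5)    # five-point, (3.8.6)
print(h3, t3)
print(h5, t5)
```

With ε = 0.01 and ω = 1 the computed values agree, to the two figures quoted, with the spacings and total errors given in the text for the three- and five-point formulas.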

The results of this example are typical of most practical situations in
which the function f(x) is representable by a Taylor series which
converges for all values of x. When the series representations have finite
radii of convergence, the quantities $M_k^{1/k}$ tend to increase with increasing
k, and the incorporation of additional ordinates may lead to a decrease in
accuracy, at an early stage, when the inaccuracies in the given data are
appreciable.

In practice, unless f(x) is given analytically, the truncation error
relevant to any Lagrangian formula can be estimated only roughly by making
two or more independent calculations, based on different sets of
ordinates, or by determining sample values of the divided difference of
order equal to the number of ordinates used. It is apparent that recourse
to the latter alternative would tend to nullify the computational
advantages which are inherent to the Lagrangian methods. However, when
equally spaced abscissas are used, divided differences of a given order can
be calculated conveniently by use of simple formulas (see Probs. 7 and 8
of Chap. 2), without resort to the formation of a divided-difference table
or to the calculation of intermediate differences of lower order.
Equation (2.3.2) is available for the same purpose in the general case.

3.9. Supplementary References. References to tables relevant to
Lagrangian interpolation, differentiation, and integration, corresponding
to equally spaced abscissas, are listed in Appendix B. Salzer [262, 26…]
presents tables for inverse Lagrangian interpolation. Luke [256]
tabulates coefficients relevant to Lagrange interpolation using the coordinate
functions e… (k = 0, 1, …). For derivations of the remainder terms
in the Newton-Cotes formulas, see §5.12 and Steffensen [18].
Integration formulas expressing $\int_a^b w(x)f(x)\,dx$ approximately as linear
combinations of values of f(x) at equally spaced points are given by Kaplan [12…]
for several choices of w(x) which correspond to the presence of
singularities of various types in the integrand. Salzer [270] gives tables relevant
to the case (a,b) = (0, ∞), $w = e^{-x}$. See also Chap. 8 for other methods
of dealing with such integrals. A useful formula for trigonometric
integrals is given by Filon [77]. Lagrangian interpolation by trigonometric
functions is outlined in Prob. 7; tables of the coefficients are given by
Salzer [269].

PROBLEMS

Section 3.2

1. By noticing that the zeroth Lagrangian coefficient function of degree n takes
the value unity when $x = x_0$ and the value zero when $x = x_1, \ldots, x_n$, and by
considering the associated divided-difference table (or otherwise), show that

$$l_0(x) = 1 + \frac{x - x_0}{x_0 - x_1} + \frac{(x - x_0)(x - x_1)}{(x_0 - x_1)(x_0 - x_2)} + \cdots + \frac{(x - x_0)\cdots(x - x_{n-1})}{(x_0 - x_1)\cdots(x_0 - x_n)}$$

and that similar expansions can be written down by symmetry for the other coefficient
functions.

2. Derive the Lagrangian interpolation formula directly from Newton's
divided-difference formula.

3. If y(x) is the polynomial of degree n which agrees with f(x) at the distinct points
$x = x_0, x_1, \ldots, x_n$, and if $\pi(x) = (x - x_0)(x - x_1)\cdots(x - x_n)$, obtain the
Lagrangian form of y(x) by determining the coefficients in the partial-fraction
expansion of the ratio

$$\frac{y(x)}{\pi(x)} = \sum_{k=0}^{n} \frac{a_k}{x - x_k}.$$

(Multiply both members by $x - x_r$ and let $x \to x_r$.)

4. Show that

$$\begin{vmatrix} 1 & a_1 & a_1^2 \\ 1 & a_2 & a_2^2 \\ 1 & a_3 & a_3^2 \end{vmatrix} = (a_2 - a_1)(a_3 - a_1)(a_3 - a_2),$$

and use this fact to express the result of expanding the left-hand member of (3.2.3)
with respect to the elements of the first column, and equating the result to zero, in
Lagrangian form when n = 2.

5. Generalize the result of Prob. 4 to show that

$$\begin{vmatrix} 1 & a_1 & \cdots & a_1^{n-1} \\ 1 & a_2 & \cdots & a_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & a_n & \cdots & a_n^{n-1} \end{vmatrix} = (a_2 - a_1)(a_3 - a_1)(a_3 - a_2)(a_4 - a_1)(a_4 - a_2)(a_4 - a_3)\cdots(a_n - a_{n-1}),$$

and deduce the Lagrangian form of the interpolation polynomial from (3.2.3) in
the general case. (The determinant involved here is often called Vandermonde's
determinant.)

6. By considering the limit of the three-point Lagrangian interpolation formula
relative to $x_0$, $x_0 + \epsilon$, and $x_1$ as $\epsilon \to 0$, obtain the formula

$$f(x) = \frac{(x_1 - x)(x + x_1 - 2x_0)}{(x_1 - x_0)^2}\, f(x_0) + \frac{(x - x_0)(x_1 - x)}{x_1 - x_0}\, f'(x_0) + \frac{(x - x_0)^2}{(x_1 - x_0)^2}\, f(x_1) + E(x),$$

where

$$E(x) = \tfrac{1}{6}(x - x_0)^2 (x - x_1) f'''(\xi).$$

7. Obtain a determinantal equation analogous to (3.2.3), but corresponding
to the requirement that $y(x) = A_0 + A_1 \cos x + A_2 \sin x$ agree with f(x) when
$x = x_0$, $x_1$, and $x_2$. Then establish the identity

$$\begin{vmatrix} 1 & \cos a_1 & \sin a_1 \\ 1 & \cos a_2 & \sin a_2 \\ 1 & \cos a_3 & \sin a_3 \end{vmatrix} = 4 \sin \tfrac{1}{2}(a_2 - a_1)\, \sin \tfrac{1}{2}(a_3 - a_1)\, \sin \tfrac{1}{2}(a_3 - a_2),$$

and use this result to express y(x) in the following form (due to Gauss):

$$y(x) = \frac{\sin \tfrac{1}{2}(x - x_1) \sin \tfrac{1}{2}(x - x_2)}{\sin \tfrac{1}{2}(x_0 - x_1) \sin \tfrac{1}{2}(x_0 - x_2)}\, f(x_0) + \frac{\sin \tfrac{1}{2}(x - x_0) \sin \tfrac{1}{2}(x - x_2)}{\sin \tfrac{1}{2}(x_1 - x_0) \sin \tfrac{1}{2}(x_1 - x_2)}\, f(x_1) + \frac{\sin \tfrac{1}{2}(x - x_0) \sin \tfrac{1}{2}(x - x_1)}{\sin \tfrac{1}{2}(x_2 - x_0) \sin \tfrac{1}{2}(x_2 - x_1)}\, f(x_2).$$

Show also that the formula resulting from deleting the $\tfrac{1}{2}$'s in this
formula (and due to Hermite) defines the approximation $y = A_0 + A_1 \cos 2x + A_2 \sin 2x$
which agrees with f(x) at the same three points. Also predict the form of the
generalization of the Gauss formula to the case when an approximation of the form

$$y = A_0 + A_1 \cos x + A_2 \sin x + \cdots + A_{2n-1} \cos nx + A_{2n} \sin nx$$

is to agree with f(x) at the 2n + 1 points $x_0, x_1, \ldots, x_{2n}$, and verify the correctness
of this conjecture.

Section 3.3

8. Prove that (3.3.20) is valid when x is outside the range R of the values $x_0, \ldots, x_n$,
by writing $F(x) = f(x) - y(x) - K\pi(x)$, showing that the function

$$F^{(r)}(x) = f^{(r)}(x) - y^{(r)}(x) - K\pi^{(r)}(x)$$

vanishes at least n - r + 1 times inside R, showing that all zeros of $\pi^{(r)}(x)$ lie inside R
and hence that K can be so chosen that $F^{(r)}(x)$ also vanishes when $x = \bar{x}$ if $\bar{x}$ is not
inside R, so that

$$E^{(r)}(\bar{x}) = K\pi^{(r)}(\bar{x}),$$

and proving that, with this K, there follows $0 = f^{(n+1)}(\xi) - (n + 1)!\,K$ for some $\xi$
between the smallest and the largest of $x_0, \ldots, x_n$, and $\bar{x}$.

9. Obtain the Lagrangian three-point first-derivative formula, in the case when the
abscissas are equally spaced, at spacing h, and the origin is taken at the central point,
in the form

$$f'(x) = \frac{2x - h}{2h^2}\, f(-h) - \frac{2x}{h^2}\, f(0) + \frac{2x + h}{2h^2}\, f(h) + E'(x),$$

with

$$E'(x) = \tfrac{1}{6}(3x^2 - h^2) f'''(\xi_1) + \tfrac{1}{24}\, x(x^2 - h^2) f^{iv}(\xi_2),$$

where $\xi_1$ and $\xi_2$ are in (-h,h) if x is in that interval. Show also that, unless $h f^{iv}$ is
large in magnitude relative to $f'''$, the error is least, on the average, at distances
about 0.6h from the central point.

10. By integrating the Lagrangian three-point formula, when the abscissas are at
equal spacing h, with the origin taken at the central point, obtain the formula

$$\int_{-h}^{x} f(x)\,dx = \frac{5h^3 - 3hx^2 + 2x^3}{12h^2}\, f(-h) + \frac{2h^3 + 3h^2 x - x^3}{3h^2}\, f(0) - \frac{h^3 - 3hx^2 - 2x^3}{12h^2}\, f(h) + E(x)$$

and show that the truncation error is expressible in the form

$$E(x) = \int_{-h}^{x} x(x^2 - h^2)\, f[-h, 0, h, x]\,dx.$$

11. If the upper limit of the integration in Prob. 10 does not lie outside the interval
(-h,0), show that

$$E(x) = \tfrac{1}{24}(x^2 - h^2)^2 f'''(\xi),$$

where $-h < \xi < h$. In particular, deduce the formula

$$\int_{x_0}^{x_0 + h} f(x)\,dx = \frac{h}{12}\left[5f(x_0) + 8f(x_0 + h) - f(x_0 + 2h)\right] + \frac{h^4}{24} f'''(\xi),$$

where $x_0 < \xi < x_0 + 2h$, after a change in notation.

12. By integrating the expression for E(x) in Prob. 10 by parts, and noticing that
$x(x^2 - h^2) = \tfrac{1}{4}\,\dfrac{d}{dx}(x^2 - h^2)^2$, show that

$$E(x) = \tfrac{1}{4}(x^2 - h^2)^2 f[-h, 0, h, x] - \tfrac{1}{4}\int_{-h}^{x} (x^2 - h^2)^2 f[-h, 0, h, x, x]\,dx$$

and deduce that

$$E(x) = \tfrac{1}{24}(x^2 - h^2)^2 f'''(\xi_1) - \tfrac{1}{1440}\left(3x^5 - 10h^2 x^3 + 15h^4 x + 8h^5\right) f^{iv}(\xi_2),$$

where $\xi_1$ and $\xi_2$ lie between the smallest and largest of -h, h, and x. In particular,
deduce the formula of Simpson's rule (see §3.5):

$$\int_{-h}^{h} f(x)\,dx = \frac{h}{3}\left[f(-h) + 4f(0) + f(h)\right] - \frac{h^5}{90} f^{iv}(\xi) \qquad (-h < \xi < h).$$

Section 3.4

13. Determine the Lagrangian coefficient functions, in explicit polynomial form,
relative to the ordinates f(x) at the four points x = -2, -1, 1, and 2. Use the
results to obtain approximate expressions for f(0), f'(0), and $\int_0^{\ldots} f(x)\,dx$ in terms of
those ordinates.

14. Use the results of Prob. 13 to determine the equation of the third-degree
polynomial passing through the points (-2,-5), (-1,-1), (1,1), and (2,11).

15. Use the Lagrange interpolation formula to calculate approximate values of f(x)
when x = 1.1300, 1.1500, 1.1700, and 1.1900 from the following rounded data:

    x      1.1275     1.1503     1.1735     1.1972
    f(x)   0.11971    0.13954    0.15932    0.17903

16. Use the results of Prob. 15 and the coefficients of Table 3.1 to determine
approximate values of f(x) for x = 1.1600(0.0010)1.1700.

17. Under the assumption that the data in Prob. 15 correspond to the function
f(x) = sin (log x), obtain bounds on the truncation errors associated with the values
calculated in Probs. 15 and 16.

18. Obtain bounds on the round-off errors associated with the values calculated in
Probs. 15 and 16.

19. Use the table of five-point Lagrangian coefficients given in §4.12 to interpolate
in that table itself for the coefficients relative to s = 0.38, 0.05, and 1.93, rounding the
results to six places. If no round-offs were effected, what errors would be present in
the calculated coefficients?

20. Show that, if $h^3|f'''(x)|$ does not exceed 16 units in the last place to be retained
in a three-point Lagrange interpolation based on equally spaced abscissas with spacing
h, then the truncation error cannot exceed one unit in that place.

21. Show that, if $h^5|f^{v}(x)|$ does not exceed 32 units in the last place to be retained
in a five-point Lagrange interpolation based on equally spaced abscissas with spacing
h, then the truncation error cannot exceed one unit in that place, and also that
$h^5|f^{v}(x)|$ may be as large as 84 units if the interpolation is effected only between the
second and fourth of the five successive abscissas.

Section 3.5

22. Prove directly, from Eq. (3.5.6), that a Newton-Cotes formula of closed type,
employing n + 1 ordinates, is exact when applied to any polynomial of degree n + 1
when n + 1 is odd. [Notice that $f^{(n+1)}(\xi)$ is then constant, write s = t + (n/2), and
whe ; 


i i i f t.] 
esultant integrand is an odd function 0 ; ; 
pe pret the factor s — (n/2) can be replaced by s — ¢ in (3.5.9b), where c 1s 


t (see Prob. 22). ; 
Oe es a formulas resulting from neglect of the error terms in (3.5.18) and 


‘ . . -) f 
ee ae that the truncation error associated with a Newton Cotes formula o 


closed type employing + 1 ordinates can be expressed in the form 


E =hr? iM s(s -—l1)- °° (s — n)flxo, . + + 1%n) Zo + hs] ds, 


i - inates is 
whereas that associated with a formula of open type employing  — 1 ordinate 
given by 


perf @-0 on be = Vfl cs tenn teh Mel de 


: P a 
26. If frorpn/2 denotes the value of f(x) at an abscissa midway between 2x 2 


tea = te +h, derive the formulas 


[Ese ae =m +10 + Be 


[ire de = F On +n +3) + Br 


Maclaurin. It can be shown that 
las are the first two of a set due to ( 
Dr Ona (to < — < 22) and that EB. = 21hsfiv(£) /640 (to < & < 23).] 
1= 


Section 3.6 
27. Given the following rounded values of the function

$$f(x) = \sqrt{\frac{2}{\pi}}\; e^{-x^2/2},$$

calculate approximate values of the integral

$$P(1) = \sqrt{\frac{2}{\pi}} \int_0^1 e^{-t^2/2}\, dt$$

by use of the trapezoidal rule with h = 1/2, 1/4, and 1/8; and compare the results with the
rounded true value 0.68269:

x     | f(x)        x     | f(x)
0.000 | 0.79788     0.625 | 0.65632
0.125 | 0.79168     0.750 | 0.60227
0.250 | 0.77334     0.875 | 0.54411
0.375 | 0.74371     1.000 | 0.48394
0.500 | 0.70413
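
The trapezoidal computation requested in Prob. 27 can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, and the final ordinate is taken as 0.48394, the five-place value of $\sqrt{2/\pi}\,e^{-1/2}$.

```python
# Tabulated values of f(x) = sqrt(2/pi) * exp(-x**2 / 2) at spacing 1/8 on [0, 1].
F = [0.79788, 0.79168, 0.77334, 0.74371, 0.70413,
     0.65632, 0.60227, 0.54411, 0.48394]
TRUE_VALUE = 0.68269  # rounded true value quoted in the problem


def trapezoid(values, h):
    """Composite trapezoidal rule for equally spaced ordinates."""
    return h * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])


def trapezoid_with_stride(values, stride, base_h=0.125):
    """Use every `stride`-th ordinate, i.e. spacing stride * base_h."""
    return trapezoid(values[::stride], stride * base_h)


for stride in (4, 2, 1):  # h = 1/2, 1/4, 1/8
    approx = trapezoid_with_stride(F, stride)
    print(f"h = {stride * 0.125:5.3f}: {approx:.5f}  error = {TRUE_VALUE - approx:+.5f}")
```

Each halving of h should reduce the error by a factor of about four, as expected for a rule whose truncation error is proportional to h².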


28. Repeat the calculations of Prob. 27, using instead the parabolic rule, and compare
the results with those of Prob. 27.
29. Repeat the calculations of Prob. 27, using instead the Newton-Cotes three-,
five-, and nine-point formulas of closed type, and compare the results with those of
Probs. 27 and 28.
30. Calculate approximate values of the integral 


1 
if ers 2ae dx = [4(1) = 1.266066 


by use of the trapezoidal rule with h = +, 4, and 4, retaining five decimal places, and 
compare the results with the true value. 


31. Repeat the calculation of Prob. 30, using instead the parabolic rule, and com- 
pare the results with those of Prob. 30. 


32. Repeat the calculations of Prob. 30, using instead the Newton-Cotes three-, 


five-, and nine-point formulas of closed type, and compare the results with those of 
Probs. 30 and 31. 


33. Obtain an approximate evaluation of the integral

$$\int_0^1 \frac{\cos x}{\sqrt{x}}\, dx$$

(a) by writing it in the form

$$\int_0^1 \frac{dx}{\sqrt{x}} - \int_0^1 \frac{1 - \cos x}{\sqrt{x}}\, dx,$$

evaluating the first integral analytically, and applying the parabolic rule with h = + to
the second one, and (b) by making the change of variables $x = t^2$ in the original form
and applying the parabolic rule with h = % directly to the result.

Also compare the approximations with a more accurate value obtained by expanding
the integrand of one of the forms in a power series and integrating term by term.


Section 3.7 
34. Show that

$$\int_0^{2m+1} s(s-1)\cdots(s-2m-1)\, ds = -\frac{2}{2m+3} \int_0^1 s(s-1)\cdots(s-2m-2)\, ds,$$

when m is a nonnegative integer. (Express the left-hand integral as a sum of integrals
between successive integers, translate all lower limits to zero, and show that the
2m + 1 terms in the resultant integrand can be telescoped into the sum of two terms.
Then replace s by 1 - s in the integrand of one of those terms.)
35. Show that the numerical factor in (3.7.1) and (3.7.2) can be expressed in the
form

$$I_{2m} = \int_0^{2m} \frac{s(s-1)\cdots(s-2m)(s-2m-1)}{(2m+2)!}\, ds,$$

when n = 2m, and in the form

$$I_{2m+1} = \int_0^{2m+1} \frac{s(s-1)\cdots(s-2m-1)}{(2m+2)!}\, ds$$

when n = 2m + 1, and show also that

$$I_{2m+1} = I_{2m} + \frac{1}{(2m+2)!} \int_{2m}^{2m+1} s(s-1)\cdots(s-2m-1)\, ds.$$
36. With the abbreviation

$$a_k = \int_0^1 \frac{s(s-1)\cdots(s-k+1)}{k!}\, ds,$$

show that the results of Probs. 34 and 35 lead to the relations

$$I_{2m} = -(2a_{2m+3} + a_{2m+2}), \qquad I_{2m+1} = -2a_{2m+3},$$

and deduce that the error associated with a Newton-Cotes formula of closed type,
employing n + 1 ordinates, can be expressed in the form

$$E_n = \begin{cases} -2a_{n+2}\, h^{n+2} f^{(n+1)}(\xi) & (n \text{ odd}), \\ -(2a_{n+3} + a_{n+2})\, h^{n+3} f^{(n+2)}(\xi) & (n \text{ even}). \end{cases}$$

37. The constant $a_k$ defined in Prob. 36 is expressible as a generalized Bernoulli
number and is often denoted by $B_k^{(k)}(1)/k!$ or by $B_k^{(k)}/k! + B_{k-1}^{(k-1)}/(k-1)!$. It is
known (see Steffensen [18]) that

$$a_k \sim \frac{(-1)^{k-1}}{k(\log k)^2} \qquad (k \to \infty).$$

Show that the numerical factor in the expression for $E_n$ is approximated
by $-2/[n(\log n)^2]$ when n is a large odd integer, and by $-1/[n(\log n)^2]$ when n
is a large even integer.
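
As a check on the relations of Probs. 34 to 36, the constants $a_k = \frac{1}{k!}\int_0^1 s(s-1)\cdots(s-k+1)\,ds$ can be evaluated exactly and combined into the Newton-Cotes error factors. The sketch below uses exact rational arithmetic; the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import factorial


def falling_factorial_coeffs(k):
    """Coefficients (lowest power first) of the polynomial s(s-1)...(s-k+1)."""
    coeffs = [Fraction(1)]
    for j in range(k):
        # Multiply the current polynomial by (s - j).
        new = [Fraction(0)] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c
            new[i] -= j * c
        coeffs = new
    return coeffs


def a(k):
    """a_k = (1/k!) * integral from 0 to 1 of s(s-1)...(s-k+1) ds."""
    coeffs = falling_factorial_coeffs(k)
    integral = sum(c / (i + 1) for i, c in enumerate(coeffs))
    return integral / factorial(k)


def newton_cotes_factor(n):
    """Numerical factor of the closed (n+1)-ordinate error term, per Prob. 36."""
    if n % 2 == 1:
        return -2 * a(n + 2)
    return -(2 * a(n + 3) + a(n + 2))


for n in range(1, 5):
    print(n, newton_cotes_factor(n))
```

For n = 1 and n = 2 the factors reduce to the familiar -1/12 of the trapezoidal rule and -1/90 of the parabolic rule.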


Section 3.8
38. From the following rounded values of $f(x) = (1 + x)^{-2}$, determine approximate
values of f'(x) for x = 1.0, 1.1, and 1.2 by use of appropriate three- and five-point
formulas, estimate the errors, and check the validity of the estimations:

x    | 1.0     1.1     1.2     1.3     1.4
f(x) | 0.2500  0.2268  0.2066  0.1890  0.1736

39. Values of a function f(x) are to be determined for x = 0 and for four additional
positive values of x, and are to be used for the approximate determination of f'(0).
Assuming that the abscissas are to be equally spaced and that the accuracy of the
calculated values can be guaranteed only within 1 per cent, and supposing that the
true function is f(x) = 1/(1 + x), determine the spacing for which the sum of the
squares of predictable upper bounds on the truncation and round-off errors is least,
and calculate the corresponding upper bound on the total error. Also compare this
situation with that in which only three ordinates are to be calculated.








CHAPTER 4 


FINITE-DIFFERENCE INTERPOLATION 


4.1. Introduction. This chapter returns to the consideration of formulas
expressed in terms of differences, rather than of the ordinates
themselves, but deals only with the cases in which the abscissas are
equally spaced. Here the rather cumbersome notation of divided differences
is not needed and is replaced by other notations which are explained
in §4.2.

The most important of the interpolation formulas which involve differences,
together with error terms, are derived in §§4.3 to 4.7, and their
respective uses are discussed and illustrated in §4.8. In this connection,
it is of some historical interest to note that the formulas bearing the names
of Gauss, Stirling, and Bessel were apparently first known to Newton,
while the formulas attributed to Newton (§4.3) are due to Gregory.
Further, Everett's first formula is due to Laplace, and Everett's second
formula was apparently first given by Steffensen.

The propagation and detection of errors in given data are considered
in §4.9, whereas a useful method of taking certain higher differences into
approximate account, by modifying certain earlier differences, is illustrated
in §4.10.

The concluding section of the chapter provides some information concerning
the behavior of the error term in certain interpolation formulas,
as more and more differences are retained, and indicates the practical
significance of that information.

4.2. Difference Notations. When data are tabulated for uniformly
spaced abscissas, with spacing h, it is convenient to express formulations
for interpolation and related processes in terms of the differences themselves,
rather than the divided differences used in Chap. 2.

For calculation near a tabular point $x_0$ at the beginning of the tabulated
range, it is conventional to define the forward difference $\Delta f(x_0)$ as

$$\Delta f(x_0) = f(x_0 + h) - f(x_0). \tag{4.2.1}$$

If also $\Delta f(x_0 + h) = f(x_0 + 2h) - f(x_0 + h)$ is known, then the second
forward difference associated with $x_0$ is defined as

$$\Delta^2 f(x_0) = \Delta f(x_0 + h) - \Delta f(x_0) = f(x_0 + 2h) - 2f(x_0 + h) + f(x_0), \tag{4.2.2}$$

and succeeding forward differences are defined by iteration. More generally,
we introduce the definitions

$$\Delta f(x) = f(x + h) - f(x), \qquad \Delta^{r+1} f(x) = \Delta^r f(x + h) - \Delta^r f(x), \tag{4.2.3}$$

the spacing h being implied in $\Delta$. If a more specific notation is needed,
the symbol $\Delta_h$ may be used in place of $\Delta$.

When forward differences are used, it is convenient to number the
abscissas $x_0, x_1, x_2, \ldots$ in increasing algebraic order, so that

$$x_{k+1} = x_k + h. \tag{4.2.4}$$

Then, with the notation of §2.3, there follows

$$\Delta f(x_k) = f(x_{k+1}) - f(x_k) = (x_{k+1} - x_k)\, f[x_k, x_{k+1}] = h\, f[x_k, x_{k+1}],$$

$$\Delta^2 f(x_k) = h\,(f[x_{k+1}, x_{k+2}] - f[x_k, x_{k+1}]) = h\,(x_{k+2} - x_k)\, f[x_k, x_{k+1}, x_{k+2}] = 2!\,h^2 f[x_k, x_{k+1}, x_{k+2}],$$

and, in general, induction shows that

$$\begin{aligned} \Delta^r f(x_k) &= (r-1)!\,h^{r-1} f[x_{k+1}, \ldots, x_{k+r}] - (r-1)!\,h^{r-1} f[x_k, \ldots, x_{k+r-1}] \\ &= (r-1)!\,h^{r-1} (x_{k+r} - x_k)\, f[x_k, \ldots, x_{k+r}] \\ &= r!\,h^r f[x_k, \ldots, x_{k+r}]. \end{aligned} \tag{4.2.5}$$

The beginning of the corresponding difference table is indicated in Fig. 4.1.
We notice that the subscript remains constant along each forward diagonal
of the table, and that the region of determination of $\Delta^r f_k$ is bounded by
the kth forward diagonal and the (k + r)th backward diagonal. Thus
$\Delta^r f_k$ depends upon the ordinates $f_k, f_{k+1}, \ldots, f_{k+r}$, as is
also indicated by (4.2.5).

[Fig. 4.1: beginning of the forward-difference table]
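
The relation (4.2.5) between forward differences and divided differences can be checked numerically. The following is a minimal sketch in exact rational arithmetic, using $f(x) = x^3$ with h = 1/2 as an arbitrary test function; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def forward_difference_columns(f_values):
    """Return [f, Δf, Δ²f, ...]; entry k of column r is Δ^r f_k."""
    columns = [list(f_values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])
    return columns


def divided_difference(xs, fs):
    """f[x_0, ..., x_r] from the usual recursive definition."""
    if len(xs) == 1:
        return fs[0]
    upper = divided_difference(xs[1:], fs[1:])
    lower = divided_difference(xs[:-1], fs[:-1])
    return (upper - lower) / (xs[-1] - xs[0])


h = Fraction(1, 2)
xs = [k * h for k in range(5)]
fs = [x ** 3 for x in xs]
columns = forward_difference_columns(fs)

# Check Δ^r f_k = r! h^r f[x_k, ..., x_{k+r}] for every entry of the table.
for r, column in enumerate(columns):
    for k, value in enumerate(column):
        dd = divided_difference(xs[k:k + r + 1], fs[k:k + r + 1])
        assert value == factorial(r) * h ** r * dd

print([str(column[0]) for column in columns])  # f_0, Δf_0, Δ²f_0, ...
```

Since the test function is a cubic, the third differences are constant and the fourth difference vanishes, which gives a quick visual check on the table.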

For calculation near the end of a tabulated range, the notation of backward
differences is often more convenient. Here we write

$$\nabla f(x) = f(x) - f(x - h), \qquad \nabla^{r+1} f(x) = \nabla^r f(x) - \nabla^r f(x - h). \tag{4.2.6}$$

If the abscissas are again numbered in accordance with (4.2.4), there
follows

$$\nabla f(x_k) = f(x_k) - f(x_{k-1}) = (x_k - x_{k-1})\, f[x_k, x_{k-1}] = h\, f[x_k, x_{k-1}]$$

and, in general,

$$\nabla^r f(x_k) = r!\,h^r f[x_k, x_{k-1}, \ldots, x_{k-r}], \tag{4.2.7}$$

in analogy with (4.2.5).

The end of the corresponding difference table is indicated in Fig. 4.2.
Here the subscript remains constant along each backward diagonal. Also,
it is seen that the difference $\nabla^r f_k$ depends upon the ordinates
$f_{k-r}, f_{k-r+1}, \ldots, f_k$, as is also indicated by (4.2.7).

[Fig. 4.2: end of the backward-difference table]

For the remaining calculation, the notation of so-called central differences
is usually most convenient. If the calculation is to be effected near a
certain interior tabular point, it is convenient to number that abscissa as
$x_0$, and to number forward abscissas as $x_1, x_2, \ldots$ and backward
abscissas as $x_{-1}, x_{-2}, \ldots$, so that (4.2.4) again holds. In the
central-difference notation, one writes

$$\delta f(x) = f(x + \tfrac12 h) - f(x - \tfrac12 h), \qquad \delta^{r+1} f(x) = \delta^r f(x + \tfrac12 h) - \delta^r f(x - \tfrac12 h). \tag{4.2.8}$$
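It is seen that $\delta f_k \equiv \delta f(x_k)$ generally does not involve tabulated ordinates.
However, the second central difference

$$\begin{aligned} \delta^2 f_k &= \delta f(x_k + \tfrac12 h) - \delta f(x_k - \tfrac12 h) \\ &= [f(x_k + h) - f(x_k)] - [f(x_k) - f(x_k - h)] \\ &= f_{k+1} - 2f_k + f_{k-1} \end{aligned}$$

does involve tabular entries, and the same is seen to be true of all central
differences $\delta^{2m} f_k$ of even order. Furthermore, we may notice that

$$\delta f_{k+1/2} = f_{k+1} - f_k$$

and, more generally, that $\delta^{2m+1} f_{k+1/2}$ involves only tabulated arguments.
With the notation of §2.3, we may write, for example,

$$\delta f_{1/2} = f_1 - f_0 = h\, f[x_0, x_1], \qquad \delta f_{-1/2} = f_0 - f_{-1} = h\, f[x_0, x_{-1}],$$

$$\delta^2 f_1 = \delta f_{3/2} - \delta f_{1/2} = h\, f[x_1, x_2] - h\, f[x_0, x_1] = 2!\,h^2 f[x_0, x_1, x_2]$$

and, in general,

$$\delta^{2m+1} f_{k+1/2} = h^{2m+1}(2m+1)!\, f[x_{k-m}, \ldots, x_k, \ldots, x_{k+m}, x_{k+m+1}], \tag{4.2.9}$$

$$\delta^{2m+1} f_{k-1/2} = h^{2m+1}(2m+1)!\, f[x_{k-m-1}, x_{k-m}, \ldots, x_k, \ldots, x_{k+m}], \tag{4.2.10}$$

and

$$\delta^{2m} f_k = h^{2m}(2m)!\, f[x_{k-m}, \ldots, x_k, \ldots, x_{k+m}]. \tag{4.2.11}$$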
The portion of the corresponding difference table in the neighborhood
of the interior tabular point $x_0$, near which calculations are to be made,
is indicated in Fig. 4.3. Here the subscript remains constant along
horizontal lines of the table, which pass through differences of only even
or only odd orders.

Thus, once a set of adjacent entries in a difference table has been numbered,
three different sets of notations are available for the differences
themselves, as may be seen from the composite Fig. 4.4. Any one of
these sets of notations would suffice. However, each possesses certain
advantages in certain applications, as will be seen.


[Fig. 4.3: central-difference table about the interior point $x_0$]

[Fig. 4.4: composite table showing the forward-, backward-, and central-difference notations for the same entries]

4.3. Newton Forward- and Backward-difference Formulas. In order
to obtain an interpolation formula such that the retention of n + 1
terms leads to the polynomial of degree n taking on the values of f(x)
at $x_0,\ x_1 = x_0 + h,\ \ldots,\ x_n = x_0 + nh$, we may refer to Newton's
divided-difference formula (2.5.2), making use of the relation

$$f[x_0, \ldots, x_k] = \frac{1}{k!\,h^k}\,\Delta^k f_0, \tag{4.3.1}$$

which follows from (4.2.5), to obtain the result

$$f(x) = f_0 + (x - x_0)\frac{\Delta f_0}{h} + (x - x_0)(x - x_1)\frac{\Delta^2 f_0}{2!\,h^2} + \cdots + (x - x_0)(x - x_1)\cdots(x - x_{n-1})\frac{\Delta^n f_0}{n!\,h^n} + E(x), \tag{4.3.2}$$

where

$$E(x) = (x - x_0)(x - x_1)\cdots(x - x_n)\,\frac{f^{(n+1)}(\xi)}{(n+1)!}, \tag{4.3.3}$$

and where $\xi$ is in the interval occupied by $x_0, \ldots, x_n$ and x, in accordance
with (2.5.3) and (2.6.6).

The formula takes on a simpler form if we introduce a dimensionless
variable s defined as distance from $x_0$ in units of h,

$$s = \frac{x - x_0}{h}, \qquad x = x_0 + hs. \tag{4.3.4}$$

Since then there follows $x - x_k = h(s - k)$, the preceding formula becomes

$$f_s = f_0 + s\,\Delta f_0 + \frac{s(s-1)}{2!}\,\Delta^2 f_0 + \cdots + \frac{s(s-1)\cdots(s-n+1)}{n!}\,\Delta^n f_0 + E_s, \tag{4.3.5}$$

where

$$E_s = h^{n+1}\,\frac{s(s-1)(s-2)\cdots(s-n)}{(n+1)!}\, f^{(n+1)}(\xi), \tag{4.3.6}$$

and where we have written

$$f_s \equiv f(x_0 + hs) = f(x), \qquad E_s \equiv E(x_0 + hs) = E(x).$$

The result of neglecting the error term is known as Newton's forward-difference
formula for interpolation. It makes use of the difference path indicated in
Fig. 4.5.

[Fig. 4.5: difference path of Newton's forward-difference formula]

In a similar way, if we require a formula successively introducing the
ordinates at $x_n, x_{n-1}, x_{n-2}$, and so forth, we may replace $x_0$ by $x_n$,
$x_1$ by $x_{n-1}$, ... in (2.5.2), to obtain

$$f(x) = f_n + (x - x_n)\, f[x_n, x_{n-1}] + (x - x_n)(x - x_{n-1})\, f[x_n, x_{n-1}, x_{n-2}] + \cdots + (x - x_n)\cdots(x - x_1)\, f[x_n, \ldots, x_0] + E(x)$$

and, writing here

$$s = \frac{x - x_n}{h}, \qquad x = x_n + hs, \tag{4.3.7}$$

we may use (4.2.7) to reduce this result to the form

$$f_{n+s} = f_n + s\,\nabla f_n + \frac{s(s+1)}{2!}\,\nabla^2 f_n + \cdots + \frac{s(s+1)\cdots(s+n-1)}{n!}\,\nabla^n f_n + E_s, \tag{4.3.8}$$

where

$$E_s = h^{n+1}\,\frac{s(s+1)\cdots(s+n)}{(n+1)!}\, f^{(n+1)}(\xi). \tag{4.3.9}$$

This formula is known as Newton's backward-difference formula, when
$E_s$ is neglected, and it utilizes the difference path indicated in Fig. 4.6.





If r + 1 terms are retained in (4.3.5), the polynomial agreeing with
f(x) at $x_0, x_1, \ldots, x_r$ is obtained; the retention of r + 1 terms in (4.3.8)
yields the polynomial agreeing with f(x) at $x_n, x_{n-1}, \ldots, x_{n-r}$. If n + 1
terms were retained in each formula, the two formulas would involve the
same ordinates and would yield the same polynomial approximation.

More generally, the former would be used near the beginning of a tabulation
(at which only forward differences are available) and the latter would be
used near the end (where only backward differences are available). In
particular, the backward-difference formula is especially useful in
extending a tabulation, and for generating other formulas useful for
advancing numerical solutions of differential equations. For this reason,
s was measured forward in the table in both formulas, so that it is positive
for extrapolation in (4.3.8), whereas it is positive for interpolation in
(4.3.5). Either formula can, of course, be used for either interpolation or
extrapolation.

[Fig. 4.6: difference path of Newton's backward-difference formula]
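
The two Newton formulas (4.3.5) and (4.3.8) can be sketched in a few lines. The following is an illustrative sketch only: the function names are hypothetical, and the data are rounded five-place values of sin x at x = 1.0(0.1)1.8, chosen simply as a convenient test table.

```python
import math

# Five-place values of sin x for x = 1.0, 1.1, ..., 1.8 (h = 0.1).
SIN_VALUES = [0.84147, 0.89121, 0.93204, 0.96356, 0.98545,
              0.99749, 0.99957, 0.99166, 0.97385]


def difference_columns(f_values):
    """Column r holds the r-th differences; entry k is Δ^r f_k."""
    columns = [list(f_values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])
    return columns


def newton_forward(f_values, s, n=None):
    """Approximate f(x_0 + h s) by (4.3.5), retaining differences through order n."""
    cols = difference_columns(f_values)
    n = len(f_values) - 1 if n is None else n
    total, coeff = 0.0, 1.0
    for k in range(n + 1):
        total += coeff * cols[k][0]        # Δ^k f_0 heads column k
        coeff *= (s - k) / (k + 1)         # s(s-1)...(s-k)/(k+1)!
    return total


def newton_backward(f_values, s, n=None):
    """Approximate f(x_n + h s) by (4.3.8), retaining differences through order n."""
    cols = difference_columns(f_values)
    n = len(f_values) - 1 if n is None else n
    total, coeff = 0.0, 1.0
    for k in range(n + 1):
        total += coeff * cols[k][-1]       # ∇^k f_n ends column k
        coeff *= (s + k) / (k + 1)         # s(s+1)...(s+k)/(k+1)!
    return total


print(newton_forward(SIN_VALUES, 0.5, n=4), math.sin(1.05))
print(newton_backward(SIN_VALUES, -0.5, n=4), math.sin(1.75))
```

When all available differences are retained, the two functions evaluate the same polynomial, so `newton_forward(SIN_VALUES, 7.5)` and `newton_backward(SIN_VALUES, -0.5)` should agree to within rounding.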


The formulas can be written in more concise form in terms of the
binomial coefficients

$$\binom{s}{k} = \frac{s(s-1)\cdots(s-k+1)}{k!}. \tag{4.3.10}$$

With this notation, the forward-difference formula becomes merely

$$f_s \equiv f(x_0 + hs) \approx \sum_{k=0}^{n} \binom{s}{k}\,\Delta^k f_0. \tag{4.3.11}$$

Further, the coefficient of $\nabla^k f_n$ in (4.3.8) is seen to be

$$\frac{s(s+1)\cdots(s+k-1)}{k!} = (-1)^k\,\frac{(-s)(-s-1)\cdots(-s-k+1)}{k!} = (-1)^k \binom{-s}{k}, \tag{4.3.12}$$

so that the backward-difference formula takes the form

$$f_{n+s} \equiv f(x_n + hs) \approx \sum_{k=0}^{n} (-1)^k \binom{-s}{k}\,\nabla^k f_n \tag{4.3.13}$$

or, alternatively,

$$f_{n-s} \equiv f(x_n - hs) \approx \sum_{k=0}^{n} (-1)^k \binom{s}{k}\,\nabla^k f_n. \tag{4.3.14}$$

Extensive tables of the coefficient functions may be found in the literature
(see references in Appendix B). A brief table, for interpolation
by tenths, through fifth differences, is included in §4.12.

4.4. The Gaussian Formulas. If the required interpolants
correspond to abscissas which are near the beginning or the
end of the tabulation, the Newtonian formulas of the preceding section
serve this purpose as well as is possible. Otherwise, it is convenient
to start with the abscissa $x_0$ nearest $\bar{x}$, then to introduce $x_1$
and $x_{-1}$, then $x_2$ and $x_{-2}$, and so forth.

If the ordinates are introduced in the order $f_0, f_1, f_{-1}, f_2, f_{-2}, \ldots$, the
result of replacing $x_0, x_1, x_2, x_3, x_4, \ldots$ by $x_0, x_1, x_{-1}, x_2, x_{-2}, \ldots$ in
(2.5.2), and the subsequent use of (4.2.9) and (4.2.11), with k = 0, leads
to the formula

$$\begin{aligned} f(x) = f_0 &+ (x - x_0)\frac{\delta f_{1/2}}{h} + (x - x_0)(x - x_1)\frac{\delta^2 f_0}{2!\,h^2} \\ &+ (x - x_0)(x - x_1)(x - x_{-1})\frac{\delta^3 f_{1/2}}{3!\,h^3} \\ &+ (x - x_0)(x - x_1)(x - x_{-1})(x - x_2)\frac{\delta^4 f_0}{4!\,h^4} + \cdots. \end{aligned}$$

If we write

$$s = \frac{x - x_0}{h}, \qquad x = x_0 + hs, \tag{4.4.1}$$

this result takes the form

$$\begin{aligned} f_s = f_0 &+ s\,\delta f_{1/2} + \frac{s(s-1)}{2!}\,\delta^2 f_0 + \frac{s(s^2-1^2)}{3!}\,\delta^3 f_{1/2} + \frac{s(s^2-1^2)(s-2)}{4!}\,\delta^4 f_0 + \cdots \\ &+ \frac{s(s^2-1^2)\cdots[s^2-(m-1)^2](s-m)}{(2m)!}\,\delta^{2m} f_0 \\ &+ \frac{s(s^2-1^2)\cdots(s^2-m^2)}{(2m+1)!}\,\delta^{2m+1} f_{1/2} + E_s, \end{aligned} \tag{4.4.2}$$

where, if nth differences are retained, n = 2m when n is even and
n = 2m + 1 when n is odd. The error term takes the form

$$E_s = h^{2m+1}\,\frac{s(s^2-1^2)\cdots(s^2-m^2)}{(2m+1)!}\, f^{(2m+1)}(\xi) \tag{4.4.3}$$


when n = 2m, and the form

$$E_s = h^{2m+2}\,\frac{s(s^2-1^2)\cdots(s^2-m^2)(s-m-1)}{(2m+2)!}\, f^{(2m+2)}(\xi) \tag{4.4.4}$$

when n = 2m + 1.

This formula employs the forward zigzag difference path indicated in
Fig. 4.7 and is known as Gauss's forward formula.

In a similar way, by introducing the ordinates in the order
$f_0, f_{-1}, f_1, f_{-2}, f_2, \ldots$, using (4.2.10) and (4.2.11) with k = 0,
and again introducing the abbreviation (4.4.1), we obtain the formula

$$\begin{aligned} f_s = f_0 &+ s\,\delta f_{-1/2} + \frac{s(s+1)}{2!}\,\delta^2 f_0 + \frac{s(s^2-1^2)}{3!}\,\delta^3 f_{-1/2} + \frac{s(s^2-1^2)(s+2)}{4!}\,\delta^4 f_0 + \cdots \\ &+ \frac{s(s^2-1^2)\cdots[s^2-(m-1)^2](s+m)}{(2m)!}\,\delta^{2m} f_0 \\ &+ \frac{s(s^2-1^2)\cdots(s^2-m^2)}{(2m+1)!}\,\delta^{2m+1} f_{-1/2} + E_s, \end{aligned} \tag{4.4.5}$$

where

$$E_s = h^{2m+1}\,\frac{s(s^2-1^2)\cdots(s^2-m^2)}{(2m+1)!}\, f^{(2m+1)}(\xi) \quad\text{or}\quad E_s = h^{2m+2}\,\frac{s(s^2-1^2)\cdots(s^2-m^2)(s+m+1)}{(2m+2)!}\, f^{(2m+2)}(\xi), \tag{4.4.6}$$

according as the formula is terminated with even or odd difference.
This formula utilizes the backward zigzag difference path in Fig. 4.7 and
is known as Gauss's backward formula.

[Fig. 4.7: forward and backward zigzag difference paths of the Gauss formulas]

When terminated with an even difference, of order 2m, both formulas
yield the polynomial agreeing with f(x) at $x_0, x_{\pm 1}, \ldots, x_{\pm m}$ and hence
are completely equivalent in that case. However, when terminated with
an odd difference, of order 2m + 1, the forward formula gives the polynomial
agreeing with f(x) at $x_0, x_{\pm 1}, \ldots, x_{\pm m}$, and $x_{m+1}$, whereas the backward
formula yields agreement at the first 2m + 1 points and at $x_{-m-1}$.
In this latter case, when seeking $f(\bar{x})$, the forward formula would be
expected to afford somewhat better results when $\bar{x}$ is between $x_0$ and $x_1$,
whereas the backward formula would generally be preferred when $\bar{x}$ is
between $x_0$ and $x_{-1}$.

With the notation (4.3.10), the Gaussian forward formula can be
written in the more compact form

$$f_s = f_0 + \binom{s}{1}\delta f_{1/2} + \binom{s}{2}\delta^2 f_0 + \binom{s+1}{3}\delta^3 f_{1/2} + \binom{s+1}{4}\delta^4 f_0 + \cdots, \tag{4.4.7}$$

whereas the corresponding backward formula becomes

$$f_s = f_0 + \binom{s}{1}\delta f_{-1/2} + \binom{s+1}{2}\delta^2 f_0 + \binom{s+1}{3}\delta^3 f_{-1/2} + \binom{s+2}{4}\delta^4 f_0 + \cdots. \tag{4.4.8}$$

Neither of these formulas is of frequent practical use, but from them
other more useful formulas may be derived.
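
The compact forms (4.4.7) and (4.4.8) translate directly into code once the central differences are located in an ordinary difference table. In the sketch below (hypothetical function names; five-place values of sin x used purely as test data), entry k of column r of the table is $\Delta^r f_k = \delta^r f_{k+r/2}$, so that $\delta^{2m} f_0$ and $\delta^{2m+1} f_{1/2}$ sit at index center - m, and $\delta^{2m+1} f_{-1/2}$ at index center - m - 1.

```python
import math

SIN_VALUES = [0.84147, 0.89121, 0.93204, 0.96356, 0.98545,
              0.99749, 0.99957, 0.99166, 0.97385]  # sin x, x = 1.0(0.1)1.8


def difference_columns(f_values):
    """Column r holds the r-th differences; entry k is Δ^r f_k."""
    columns = [list(f_values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])
    return columns


def binom(s, k):
    """Generalized binomial coefficient s(s-1)...(s-k+1)/k! of (4.3.10)."""
    result = 1.0
    for j in range(k):
        result *= (s - j) / (j + 1)
    return result


def gauss_forward(f_values, center, s, order):
    """Gauss forward formula (4.4.7) about f_values[center], through `order`."""
    cols = difference_columns(f_values)
    total = f_values[center]
    for r in range(1, order + 1):
        m = r // 2                                   # r = 2m or 2m + 1
        total += binom(s + (r - 1) // 2, r) * cols[r][center - m]
    return total


def gauss_backward(f_values, center, s, order):
    """Gauss backward formula (4.4.8) about f_values[center], through `order`."""
    cols = difference_columns(f_values)
    total = f_values[center]
    for r in range(1, order + 1):
        m = r // 2
        index = center - m if r % 2 == 0 else center - m - 1
        total += binom(s + r // 2, r) * cols[r][index]
    return total


# Interpolate sin 1.25 about x_0 = 1.2 (center index 2), s = 0.5.
print(gauss_forward(SIN_VALUES, 2, 0.5, 4), gauss_backward(SIN_VALUES, 2, 0.5, 4))
print(math.sin(1.25))
```

Terminated with the fourth difference, the two results should coincide to within rounding, in accordance with the equivalence noted above; terminated with the third difference, they differ.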

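4.5. Stirling's Formula. When interpolations are to be effected for
values of $\bar{x}$ near an interior point $x_0$, say, between $x_0 - \tfrac12 h$ and $x_0 + \tfrac12 h$,
a formula of frequent use may be obtained by forming the mean of the
Gaussian forward and backward formulas and so introducing a symmetry
about the abscissa $x_0$:

$$\begin{aligned} f_s = f_0 &+ \frac{s}{2}\,(\delta f_{1/2} + \delta f_{-1/2}) + \frac{s}{2 \cdot 2!}\,[(s-1) + (s+1)]\,\delta^2 f_0 \\ &+ \frac{s(s^2-1^2)}{2 \cdot 3!}\,(\delta^3 f_{1/2} + \delta^3 f_{-1/2}) + \frac{s(s^2-1^2)}{2 \cdot 4!}\,[(s-2) + (s+2)]\,\delta^4 f_0 + \cdots \\ &+ \frac{s(s^2-1^2)\cdots[s^2-(m-1)^2]}{2\,(2m)!}\,[(s-m) + (s+m)]\,\delta^{2m} f_0 \\ &+ \frac{s(s^2-1^2)\cdots(s^2-m^2)}{2\,(2m+1)!}\,(\delta^{2m+1} f_{1/2} + \delta^{2m+1} f_{-1/2}) + E_s. \end{aligned} \tag{4.5.1}$$

It is then convenient to introduce symbols for the mean odd differences
which appear in this formula. The notation

$$\mu f(x) = \frac{1}{2}\left[ f\!\left(x + \frac{h}{2}\right) + f\!\left(x - \frac{h}{2}\right) \right] \tag{4.5.2}$$

is often used, so that, for example, we may write

$$\mu\delta f_0 = \tfrac12(\delta f_{1/2} + \delta f_{-1/2}), \qquad \mu\delta^3 f_0 = \tfrac12(\delta^3 f_{1/2} + \delta^3 f_{-1/2}). \tag{4.5.3}$$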


With this notation for the so-called mean central differences of odd order,
(4.5.1) takes the form

$$\begin{aligned} f_s = f_0 &+ s\,\mu\delta f_0 + \frac{s^2}{2!}\,\delta^2 f_0 + \frac{s(s^2-1^2)}{3!}\,\mu\delta^3 f_0 + \frac{s^2(s^2-1^2)}{4!}\,\delta^4 f_0 + \cdots \\ &+ \frac{s^2(s^2-1^2)\cdots[s^2-(m-1)^2]}{(2m)!}\,\delta^{2m} f_0 \\ &+ \frac{s(s^2-1^2)\cdots(s^2-m^2)}{(2m+1)!}\,\mu\delta^{2m+1} f_0 + E_s. \end{aligned} \tag{4.5.4}$$

The result of omitting $E_s$ is known as Stirling's formula for interpolation.
Since the errors associated with terminating (4.4.2) and (4.4.5) with
an even difference are identical, there follows also

$$E_s = h^{2m+1}\,\frac{s(s^2-1^2)\cdots(s^2-m^2)}{(2m+1)!}\, f^{(2m+1)}(\xi) \tag{4.5.5}$$

when n = 2m. As in the preceding cases, $\xi$ is intermediate between the
largest and smallest of the abscissas involved in the formula (here $x_0$,
$x_{\pm 1}, \ldots, x_{\pm m}$, and x).

However, when n = 2m + 1, the mean of the errors (4.4.4) and (4.4.6)
takes the form

$$E_s = h^{2m+2}\,\frac{s(s^2-1^2)\cdots(s^2-m^2)}{2\,(2m+2)!}\,\left[(s-m-1)\, f^{(2m+2)}(\xi_1) + (s+m+1)\, f^{(2m+2)}(\xi_2)\right], \tag{4.5.6}$$

where both $\xi_1$ and $\xi_2$ lie inside the interval including $x_0, x_{\pm 1}, \ldots,
x_{\pm(m+1)}$ and x. Thus, when Stirling's formula is terminated with an odd
difference, the error term does not take a simple form similar to (4.5.5).

[Fig. 4.8: difference path of Stirling's formula]

It should be noticed that the interpolation polynomial of degree 2m + 1
which is yielded by the formula in this case, agrees with f(x) at the 2m + 1
points $x_0, x_{\pm 1}, \ldots, x_{\pm m}$, but that an additional (2m + 2)th point of
agreement (which would serve to specify the polynomial) is not known.

The Stirling formula is equivalent to either Gaussian formula when
terminated with even differences. But even in this case its form is
more convenient because of the fact that the coefficients of the differences
of even order are even functions of s, whereas the coefficients of the mean
differences of odd order are odd functions of s. With the notation
(4.3.10), it can be expressed in the more compact form

$$f_s = f_0 + \binom{s}{1}\mu\delta f_0 + \frac{s}{2}\binom{s}{1}\delta^2 f_0 + \binom{s+1}{3}\mu\delta^3 f_0 + \frac{s}{4}\binom{s+1}{3}\delta^4 f_0 + \cdots. \tag{4.5.7}$$

A brief table of the coefficients is presented in §4.12. More extensive
tables are listed in the literature (see references in Appendix B).
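
Stirling's formula (4.5.4) can be sketched as a single loop in which a running product $(s^2 - 1^2)\cdots(s^2 - m^2)$ is shared between the odd and even terms. The function name and the sin x test data below are illustrative assumptions, not part of the text.

```python
import math
from math import factorial

SIN_VALUES = [0.84147, 0.89121, 0.93204, 0.96356, 0.98545,
              0.99749, 0.99957, 0.99166, 0.97385]  # sin x, x = 1.0(0.1)1.8


def difference_columns(f_values):
    """Column r holds the r-th differences; entry k is Δ^r f_k."""
    columns = [list(f_values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])
    return columns


def stirling(f_values, center, s, order):
    """Stirling's formula (4.5.4) about f_values[center], through `order`."""
    cols = difference_columns(f_values)
    total = f_values[center]
    prod = 1.0  # (s^2 - 1^2)...(s^2 - m^2), empty product at the start
    for r in range(1, order + 1):
        m = r // 2
        if r % 2 == 1:
            # Mean odd difference: average of δ^r f_{1/2} and δ^r f_{-1/2}.
            mean_diff = 0.5 * (cols[r][center - m] + cols[r][center - m - 1])
            total += s * prod / factorial(r) * mean_diff
        else:
            # Even difference δ^{2m} f_0; coefficient uses the product to m - 1.
            total += s * s * prod / factorial(r) * cols[r][center - m]
            prod *= s * s - m * m
    return total


# Interpolate about x_0 = 1.2 (center index 2) on either side.
print(stirling(SIN_VALUES, 2, 0.5, 4), math.sin(1.25))
print(stirling(SIN_VALUES, 2, -0.5, 4), math.sin(1.15))
```

Because the even coefficients are even in s and the odd coefficients are odd in s, the same coefficient values serve for +s and -s apart from sign, which is the practical convenience noted above.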
4.6. Bessel's Formula. Whereas Stirling's formula is particularly
useful for interpolation near a tabular entry $x_0$, the need frequently
arises for a formula appropriate for interpolation over the interval between
$x_0$ and $x_1$. In order to obtain a formula in which the set of
differences involved is symmetric about a horizontal line midway between
$x_0$ and $x_1$, we again make use of the Gaussian forward formula

$$f_s = f_0 + s\,\delta f_{1/2} + \frac{s(s-1)}{2!}\,\delta^2 f_0 + \frac{s(s^2-1)}{3!}\,\delta^3 f_{1/2} + \cdots, \tag{4.6.1}$$

which involves the differences along the forward zigzag of Fig. 4.9, and
average it with a formula which involves the differences along the backward
zigzag of that figure. The latter formula may be obtained most
easily by noticing that, if s were to be measured from $x_1$, the formula
would be obtained by advancing all subscripts in the Gaussian backward
formula by unity. Hence, if s is to be measured from $x_0$ in both formulas,
we must advance the subscripts in (4.4.5) by unity and, at the same time,
replace s by s - 1, to give the result

$$f_s = f_1 + (s-1)\,\delta f_{1/2} + \frac{s(s-1)}{2!}\,\delta^2 f_1 + \frac{s(s-1)(s-2)}{3!}\,\delta^3 f_{1/2} + \cdots. \tag{4.6.2}$$

[Fig. 4.9: forward and backward zigzag paths averaged in Bessel's formula]

The mean of (4.6.1) and (4.6.2) then takes the form

$$\begin{aligned} f_s = \mu f_{1/2} &+ (s - \tfrac12)\,\delta f_{1/2} + \frac{s(s-1)}{2!}\,\mu\delta^2 f_{1/2} + \frac{s(s-1)(s-\tfrac12)}{3!}\,\delta^3 f_{1/2} + \cdots \\ &+ \frac{s(s^2-1^2)\cdots[s^2-(m-1)^2](s-m)}{(2m)!}\,\mu\delta^{2m} f_{1/2} \\ &+ \frac{s(s^2-1^2)\cdots[s^2-(m-1)^2](s-m)(s-\tfrac12)}{(2m+1)!}\,\delta^{2m+1} f_{1/2} + E_s, \end{aligned} \tag{4.6.3}$$

and is known as Bessel's formula.

When terminated with an odd difference, of order 2m + 1, both (4.6.1)
and (4.6.2) yield the polynomial of degree 2m + 1 agreeing with f(x)
when $x = x_0, x_{\pm 1}, \ldots, x_{\pm m}$, and $x_{m+1}$. Hence the same statement
applies to Bessel's formula, and the error in that case is consequently
identical with (4.4.4),

$$E_s = h^{2m+2}\,\frac{s(s^2-1^2)\cdots(s^2-m^2)(s-m-1)}{(2m+2)!}\, f^{(2m+2)}(\xi), \tag{4.6.4}$$

when n = 2m + 1. However, when Bessel's formula is terminated with
even differences the error term is obtained as the mean of (4.4.3) and the
first form of (4.4.6) with s replaced by s - 1, noticing that the parameter
$\xi$ is not generally the same in the two expressions, in the less simple form

$$E_s = h^{2m+1}\,\frac{s(s^2-1^2)\cdots[s^2-(m-1)^2](s-m)}{2\,(2m+1)!}\,\left[(s+m)\, f^{(2m+1)}(\xi_1) + (s-m-1)\, f^{(2m+1)}(\xi_2)\right] \tag{4.6.5}$$

when n = 2m, where $\xi_1$ and $\xi_2$ lie inside the interval including $x_0, x_{\pm 1},
\ldots, x_{\pm m}, x_{m+1}$, and x.

A brief table of the coefficients appearing in (4.6.3) is given in §4.12.
More extensive tables are available in the literature (see Appendix B).
Whereas this form therefore is the one most often used in practice, the
symmetry about the mid-point of the interval $(x_0, x_1)$ becomes more
evident if we write $s = t + \tfrac12$, so that

$$t = \frac{x - \tfrac12(x_0 + x_1)}{h} = s - \frac12, \tag{4.6.6}$$

and hence t is distance measured from that mid-point in units of h. It
is readily verified that Bessel's formula then takes the equivalent form

$$\begin{aligned} f_s = \mu f_{1/2} &+ t\,\delta f_{1/2} + \frac{t^2 - \tfrac14}{2!}\,\mu\delta^2 f_{1/2} + \frac{t(t^2 - \tfrac14)}{3!}\,\delta^3 f_{1/2} \\ &+ \frac{(t^2 - \tfrac14)(t^2 - \tfrac94)}{4!}\,\mu\delta^4 f_{1/2} + \frac{t(t^2 - \tfrac14)(t^2 - \tfrac94)}{5!}\,\delta^5 f_{1/2} + \cdots, \end{aligned} \tag{4.6.7}$$

where the terminating term and the corresponding error term are obtainable
by introducing (4.6.6) into the forms given in (4.6.3) to (4.6.5).

Thus we see that the coefficients of mean even differences are even functions
of t, whereas the coefficients of odd differences are odd functions of t.
An important special case results, by setting $s = \tfrac12$ in (4.6.3) or t = 0
in (4.6.7), in the form

$$\begin{aligned} f_{1/2} = \tfrac12(f_0 + f_1) &- \tfrac{1}{16}(\delta^2 f_0 + \delta^2 f_1) + \tfrac{3}{256}(\delta^4 f_0 + \delta^4 f_1) - \tfrac{5}{2048}(\delta^6 f_0 + \delta^6 f_1) + \cdots \\ &+ (-1)^m\,\frac{[1 \cdot 3 \cdot 5 \cdots (2m-1)]^2}{2^{2m+1}(2m)!}\,(\delta^{2m} f_0 + \delta^{2m} f_1) + E, \end{aligned} \tag{4.6.8}$$

where

$$E = (-1)^{m+1}\,\frac{[1 \cdot 3 \cdot 5 \cdots (2m+1)]^2}{2^{2m+2}(2m+2)!}\,h^{2m+2} f^{(2m+2)}(\xi), \tag{4.6.9}$$











and where $x_0 - mh < \xi < x_1 + mh$. This formula is known as the formula
for interpolating to halves and is particularly useful in subtabulation
of data.

With the notation of (4.3.10), Bessel's formula (4.6.3) can also be written
in the form

$$\begin{aligned} f_s = \mu f_{1/2} &+ (s - \tfrac12)\,\delta f_{1/2} + \binom{s}{2}\mu\delta^2 f_{1/2} + \frac{s - \tfrac12}{3}\binom{s}{2}\delta^3 f_{1/2} \\ &+ \binom{s+1}{4}\mu\delta^4 f_{1/2} + \frac{s - \tfrac12}{5}\binom{s+1}{4}\delta^5 f_{1/2} + \cdots. \end{aligned} \tag{4.6.10}$$
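
The formula (4.6.8) for interpolating to halves lends itself to a short sketch, since only even differences at the two neighboring tabular points are needed. The helper names and the sin x test data are assumptions made for illustration.

```python
import math
from math import factorial

SIN_VALUES = [0.84147, 0.89121, 0.93204, 0.96356, 0.98545,
              0.99749, 0.99957, 0.99166, 0.97385]  # sin x, x = 1.0(0.1)1.8


def difference_columns(f_values):
    """Column r holds the r-th differences; entry k is Δ^r f_k."""
    columns = [list(f_values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([prev[k + 1] - prev[k] for k in range(len(prev) - 1)])
    return columns


def half_way_coefficient(m):
    """Coefficient of (δ^{2m} f_0 + δ^{2m} f_1) in (4.6.8)."""
    odd_product = 1
    for j in range(1, m + 1):
        odd_product *= 2 * j - 1            # 1 * 3 * 5 * ... * (2m - 1)
    return (-1) ** m * odd_product ** 2 / (2 ** (2 * m + 1) * factorial(2 * m))


def interpolate_to_half(f_values, k, m_max):
    """Approximate f midway between entries k and k + 1, through order 2*m_max."""
    cols = difference_columns(f_values)
    total = 0.0
    for m in range(m_max + 1):
        # δ^{2m} f_k is entry k - m of column 2m; δ^{2m} f_{k+1} is the next entry.
        pair = cols[2 * m][k - m] + cols[2 * m][k + 1 - m]
        total += half_way_coefficient(m) * pair
    return total


print(interpolate_to_half(SIN_VALUES, 2, 2), math.sin(1.25))
```

Applied to every adjacent pair of entries, a routine of this kind halves the tabular spacing, which is the subtabulation use mentioned above.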


4.7. Everett's Formulas. In many tabulations, auxiliary tables of
certain differences of even order (usually $\delta^2$ and $\delta^4$) are provided.
In order to obtain an interpolation formula which involves only
differences of even order, we may, for example, start with the Gauss forward
formula, terminated with an odd difference and written in the form

$$\begin{aligned} f_s = (f_0 + s\,\delta f_{1/2}) &+ \frac{s(s-1)}{2!}\left(\delta^2 f_0 + \frac{s+1}{3}\,\delta^3 f_{1/2}\right) + \frac{s(s^2-1^2)(s-2)}{4!}\left(\delta^4 f_0 + \frac{s+2}{5}\,\delta^5 f_{1/2}\right) + \cdots \\ &+ \frac{s(s^2-1^2)\cdots[s^2-(m-1)^2](s-m)}{(2m)!}\left(\delta^{2m} f_0 + \frac{s+m}{2m+1}\,\delta^{2m+1} f_{1/2}\right) + E_s, \end{aligned} \tag{4.7.1}$$

$$E_s = h^{2m+2}\,\frac{s(s^2-1^2)\cdots(s^2-m^2)(s-m-1)}{(2m+2)!}\, f^{(2m+2)}(\xi). \tag{4.7.2}$$

If we now make use of the relations

$$\delta f_{1/2} = f_1 - f_0, \qquad \delta^3 f_{1/2} = \delta^2 f_1 - \delta^2 f_0, \qquad \cdots, \tag{4.7.3}$$

this formula becomes

$$\begin{aligned} f_s = (1-s)\, f_0 &- \frac{s(s-1)(s-2)}{3!}\,\delta^2 f_0 - \frac{(s+1)s(s-1)(s-2)(s-3)}{5!}\,\delta^4 f_0 - \cdots \\ &- \frac{(s+m-1)(s+m-2)\cdots(s-m-1)}{(2m+1)!}\,\delta^{2m} f_0 \\ + s\, f_1 &+ \frac{(s+1)s(s-1)}{3!}\,\delta^2 f_1 + \frac{(s+2)(s+1)s(s-1)(s-2)}{5!}\,\delta^4 f_1 + \cdots \\ &+ \frac{(s+m)(s+m-1)\cdots(s-m)}{(2m+1)!}\,\delta^{2m} f_1 + E_s, \end{aligned} \tag{4.7.4}$$

where $E_s$ is given by (4.7.2). The interpolation formula resulting from
neglect of $E_s$ is known as Everett's first formula (or frequently merely as
Everett's formula).

In place of using the differences $\mu\delta^{2m} f_{1/2}$ and $\delta^{2m+1} f_{1/2}$, which are involved in
Bessel's formula, it uses the differences $\delta^{2m} f_0$ and $\delta^{2m} f_1$. However, it is
seen that the result of terminating Bessel's formula with the (2m + 1)th
difference must give a result identical with that of terminating Everett's
first formula with the two (2m)th differences, for both of these formulas are
equivalent to the Gauss forward formula terminated with the (2m + 1)th
difference, as may be verified directly by comparison of the error
terms.

While the same number of terms must be evaluated in using the two
formulas just mentioned, if tables are available which include differences
of orders, say, two and four, then the use of the Everett formula permits
a calculation taking into account all differences through the fifth without
need of differencing on the part of the computer.

A brief table of the coefficients is provided in §4.12 (see references in
Appendix B for more elaborate tables).
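
Everett's first formula needs only the two bracketing ordinates and their tabulated even differences, which the following sketch makes explicit. The coefficient of $\delta^{2m} f_1$ is written as the generalized binomial coefficient $\binom{s+m}{2m+1}$, and that of $\delta^{2m} f_0$ follows on replacing s by 1 - s. The function names are hypothetical; the numerical values are five-place sin x entries at x = 1.2 and 1.3 with their second and fourth central differences.

```python
import math


def binom(s, k):
    """Generalized binomial coefficient s(s-1)...(s-k+1)/k! of (4.3.10)."""
    result = 1.0
    for j in range(k):
        result *= (s - j) / (j + 1)
    return result


def everett(f0, f1, even_diffs_0, even_diffs_1, s):
    """Everett's first formula (4.7.4).

    even_diffs_0 = [δ²f_0, δ⁴f_0, ...] and even_diffs_1 = [δ²f_1, δ⁴f_1, ...].
    """
    total = (1 - s) * f0 + s * f1
    for m, (d0, d1) in enumerate(zip(even_diffs_0, even_diffs_1), start=1):
        total += binom(1 - s + m, 2 * m + 1) * d0   # coefficient with s -> 1 - s
        total += binom(s + m, 2 * m + 1) * d1
    return total


# sin 1.2 and sin 1.3 with their δ² and δ⁴ (five-place data, h = 0.1).
f0, f1 = 0.93204, 0.96356
d0 = [-0.00931, 0.00008]
d1 = [-0.00963, 0.00010]
print(everett(f0, f1, d0, d1, 0.5), math.sin(1.25))
```

No differencing is performed inside the routine, which reflects the point made above: with $\delta^2$ and $\delta^4$ supplied alongside the data, the effect of all differences through the fifth is obtained directly.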

In a similar way, a formula involving only differences of odd order can
be obtained from the Gauss forward formula terminated with an even
difference,

$$\begin{aligned} f_s = f_0 &+ s\left(\delta f_{1/2} + \frac{s-1}{2}\,\delta^2 f_0\right) + \frac{s(s^2-1^2)}{3!}\left(\delta^3 f_{1/2} + \frac{s-2}{4}\,\delta^4 f_0\right) + \cdots \\ &+ \frac{s(s^2-1^2)\cdots(s^2-m^2)}{(2m+1)!}\left(\delta^{2m+1} f_{1/2} + \frac{s-m-1}{2m+2}\,\delta^{2m+2} f_0\right) + E_s, \end{aligned} \tag{4.7.5}$$

where

$$E_s = h^{2m+3}\,\frac{s(s^2-1^2)\cdots[s^2-(m+1)^2]}{(2m+3)!}\, f^{(2m+3)}(\xi), \tag{4.7.6}$$

by writing

$$\delta^2 f_0 = \delta f_{1/2} - \delta f_{-1/2}, \qquad \delta^4 f_0 = \delta^3 f_{1/2} - \delta^3 f_{-1/2}, \qquad \cdots. \tag{4.7.7}$$

The result takes the form

$$\begin{aligned} f_s = f_0 &+ \frac{(s+1)s}{2!}\,\delta f_{1/2} + \frac{(s+2)(s+1)s(s-1)}{4!}\,\delta^3 f_{1/2} + \cdots + \frac{(s+m+1)(s+m)\cdots(s-m)}{(2m+2)!}\,\delta^{2m+1} f_{1/2} \\ &- \frac{s(s-1)}{2!}\,\delta f_{-1/2} - \frac{(s+1)s(s-1)(s-2)}{4!}\,\delta^3 f_{-1/2} - \cdots - \frac{(s+m)(s+m-1)\cdots(s-m-1)}{(2m+2)!}\,\delta^{2m+1} f_{-1/2} + E_s, \end{aligned} \tag{4.7.8}$$

where $E_s$ is given by (4.7.6). The result of neglecting $E_s$ is known as
Everett's second formula (often also as Steffensen's formula), but has not
found much favor in practice. The result of terminating it with the
(2m + 1)th differences is equivalent to that of terminating Stirling's
formula with the (2m + 2)th difference. A brief table of its coefficients
is provided in §4.12.


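If we introduce the notation of (4.3.10), Everett's first formula can be
put in the form

$$\begin{aligned} f_s = (1-s)\, f_0 &+ \binom{2-s}{3}\delta^2 f_0 + \binom{3-s}{5}\delta^4 f_0 + \cdots \\ + s\, f_1 &+ \binom{s+1}{3}\delta^2 f_1 + \binom{s+2}{5}\delta^4 f_1 + \cdots, \end{aligned} \tag{4.7.9}$$

so that the coefficients of one line are obtained by replacing s by 1 - s
in those of the other line, whereas Everett's second formula becomes

$$\begin{aligned} f_s = f_0 &+ \binom{s+1}{2}\delta f_{1/2} + \binom{s+2}{4}\delta^3 f_{1/2} + \cdots \\ &- \binom{1-s}{2}\delta f_{-1/2} - \binom{2-s}{4}\delta^3 f_{-1/2} - \cdots, \end{aligned} \tag{4.7.10}$$

so that one set of coefficients is obtained from the other by replacing s
by -s.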

4.8. Use of Interpolation Formulas. As was mentioned earlier, the
Newton formulas with forward and backward differences are most
appropriate for calculation near the beginning and end, respectively,
of a tabulation, and their use is mainly restricted to such situations.

The Gaussian forward and backward formulas terminated with an
even difference are equivalent to each other, and to the Stirling formula,
terminated with the same difference. The Gaussian forward formula
terminated with an odd difference is equivalent to the Bessel formula
terminated with the same difference. The Gaussian backward formula
launched from $x_0$, and terminating with an odd difference, is equivalent
to the Bessel formula launched from $x_{-1}$, terminated with the same
difference. Thus, in place of using a Gaussian formula, one may always use
an equivalent formula of either Stirling or Bessel, for which the coefficients
are extensively tabulated.

Reference to (4.5.4) shows that the coefficients of all differences of even
order in Stirling's formula involve $s^2$ as a factor. Thus, for interpolation
near $x_0$, it may be expected that the result of terminating that formula
with a mean odd difference $\mu\delta^{2m+1} f_0$ will be nearly as accurate, on the
average, as the result of retaining one additional difference. However, the
relative complexity of the remainder term (4.5.6) in that case is somewhat
of a disadvantage when a precise error bound is required.

A comparison of (4.5.6) and (4.6.4) shows that, in addition to common
factors, the Stirling error involves the factor

$$\tfrac12\left[(m + 1 + s)\, f^{(2m+2)}(\xi_2) - (m + 1 - s)\, f^{(2m+2)}(\xi_1)\right],$$

whereas the Bessel error involves the factor

$$-(m + 1 - s)\, f^{(2m+2)}(\xi).$$

If it is known only that $|f^{(2m+2)}(x)| \le M$ for $x_{-m-1} \le x \le x_{m+1}$, the Stirling
factor can be guaranteed only not to exceed (m + 1)M in magnitude,
whereas the Bessel factor cannot exceed (m + 1 - s)M in magnitude, if
extrapolation is excluded. Thus, from the point of view of predictable
error bounds, Bessel's formula actually displays a slight advantage† when
the highest difference to be retained is odd, in spite of the fact that
Stirling's formula then makes use of information afforded by an additional
ordinate. In any case, the Stirling formula is most efficient near s = 0, say,
for $-\tfrac14 \le s \le \tfrac14$, that is, for calculation between $x_0 - h/4$ and
$x_0 + h/4$.

A similar comparison of (4.5.5) and (4.6.5) indicates that, whereas the
result of truncating the Bessel formula with a mean even difference makes
use of more information than does the Stirling formula truncated with
the corresponding ordinary even difference, the Stirling formula is
slightly preferable from the point of view of predictable error
bounds when the highest difference to be retained is even. In any
case, the Bessel formula is most efficient (in general) near $s = \tfrac12$, say, for
$\tfrac14 \le s \le \tfrac34$, that is, for calculation between $x_0 + h/4$ and $x_1 - h/4$.

In a series of calculations based on a given set of data, it is inconvenient
to shift from one of these two formulas to the other, and one formula
must be chosen. Given a set of data, a decision would first be made as
to the highest difference which was to be retained. If that difference
were of even order, Stirling's formula perhaps would be preferred;
if it were of odd order, Bessel's formula might be preferred.‡
The difference in accuracy between the two formulas is usually small, so
that the choice is usually dictated by convenience.

Everett's first formula is particularly useful when tables of
certain even differences accompany the given data; Everett's second
formula would be useful if its coefficients were tabulated and if auxiliary odd
differences were available.

† If it is known (for example) that $f^{(2m+2)}(x)$ is of constant sign in the relevant interval,
it is seen that the advantage clearly is generally reversed.
‡ The unreliability of such generalizations is illustrated by a comparison of the results
of Probs. 18 and 19.

To illustrate the use of these formulas, we consider the following difference 
table, based on five-place data taken from a table of f(x) = sin x, 
where the differences are given in units of the fifth place and where the 
figures in parentheses are auxiliary mean central differences used in the 
calculations to be described: 





 x  |  f(x)          Δ         Δ²       Δ³       Δ⁴      Δ⁵

1.0 | 0.84147
                    4974
1.1 | 0.89121                 −891
                    4083                −40
1.2 | 0.93204    (3617.5)     −931     (−36)      8
     (0.94780)      3152     (−947)     −32      (9)      2
1.3 | 0.96356                 −963               10
                    2189                −22               1
1.4 | 0.98545                 −985               11
                    1204                −11              −3
1.5 | 0.99749                 −996                8
                     208                 −3               4
1.6 | 0.99957                 −999               12
                    −791                  9
1.7 | 0.99166                 −990
                   −1781
1.8 | 0.97385


A convenient check on the differencing effected in any difference table 
consists in the fact that the sum of the entries in any column of differences 
should equal the difference between the last and first entries of the preceding 
column. To see that this is so, suppose that the entries in a certain 
column, reading downward, are $u_1, u_2, u_3, \ldots, u_r$. Then the 
corresponding entries in the next column to the right are $(u_2 - u_1)$, 
$(u_3 - u_2)$, ..., $(u_{r-1} - u_{r-2})$, and $(u_r - u_{r-1})$, and the sum 
of these quantities evidently "telescopes" into $u_r - u_1$. 
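
The check just described is easy to mechanize. The following is a minimal sketch, assuming the tabulated values of sin x are held as integers in units of the fifth decimal place; the function names `difference_table` and `telescoping_check` are illustrative choices and are not taken from the text.

```python
def difference_table(values):
    """Return [values, first differences, second differences, ...]."""
    columns = [list(values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([b - a for a, b in zip(prev, prev[1:])])
    return columns


def telescoping_check(columns):
    """True if every difference column sums to (last - first) of the column before it."""
    return all(
        sum(col) == prev[-1] - prev[0]
        for prev, col in zip(columns, columns[1:])
    )


# sin x for x = 1.0(0.1)1.8, in units of the fifth decimal place
F = [84147, 89121, 93204, 96356, 98545, 99749, 99957, 99166, 97385]
table = difference_table(F)

for order, column in enumerate(table[1:6], start=1):
    print(order, column)
print("check passed:", telescoping_check(table))
```

Working in integer units keeps the sums exact, so the check does not depend on floating-point rounding.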

Because of the irregular fluctuation of the fifth differences in the given 
table, we would suppose that they are not significant but that they 
principally reflect the propagated effects of round-off errors present in the 
given data (see also §4.9). In fact, it would be suspected that the 
fluctuation of the fourth differences about their mean value of about 10 is 
also principally due to these inherent errors in the original data. Thus, 
not more than the first four differences are to be used here. Whether 
these differences are sufficient, and whether they are all needed, could be 
determined from the error term associated with the formula to be used, 
if knowledge of the analytical form of f(x) were presumed. 

In order to interpolate for f(1.02), we would use Newton's forward-
difference formula, with s = 0.2: 

$$\begin{aligned}
f(1.02) &\approx 0.84147 + 0.2(0.04974) + \frac{(0.2)(-0.8)}{2}(-0.00891) + \frac{(0.2)(-0.8)(-1.8)}{6}(-0.00040) \\
&\quad + \frac{(0.2)(-0.8)(-1.8)(-2.8)}{24}(0.00008) \\
&= 0.84147 + 0.009948 + 0.000713 - 0.000019 - 0.000003 \\
&= 0.852109 \doteq 0.85211,
\end{aligned}$$

which is correct to five places. 

The interpolation for f(1.75) would be accomplished by use of Newton's 
backward-difference formula, with s = −0.5: 

$$\begin{aligned}
f(1.75) &\approx 0.97385 + (-0.5)(-0.01781) + \frac{(-0.5)(0.5)}{2}(-0.00990) + \frac{(-0.5)(0.5)(1.5)}{6}(0.00009) \\
&\quad + \frac{(-0.5)(0.5)(1.5)(2.5)}{24}(0.00012) \\
&= 0.97385 + 0.008905 + 0.001238 - 0.000006 - 0.000005 \\
&= 0.983982 \doteq 0.98398,
\end{aligned}$$

the rounded value being in error by defect of one unit in the fifth place. 

In order to interpolate for f(1.22), we could use either Stirling's formula 
or Bessel's formula, with x₀ = 1.2 and s = 0.2 in either case. Since the 
formula is to terminate with an even difference (and also since s is nearer 
s = 0 than s = 1/2), Stirling's formula might be preferred. After inserting 
the mean odd differences indicated in parentheses in the row x = 1.2 of the 
difference table, the use of Stirling's formula gives 

$$\begin{aligned}
f(1.22) &\approx 0.93204 + 0.2(0.036175) + \frac{0.04}{2}(-0.00931) + \frac{(0.2)(-0.96)}{6}(-0.00036) + \frac{(0.04)(-0.96)}{24}(0.00008) \\
&= 0.93204 + 0.007235 - 0.000186 + 0.000012 - 0.000000 \\
&= 0.939101 \doteq 0.93910,
\end{aligned}$$

whereas, after inserting appropriate mean even differences in the table, 
the use of Bessel's formula gives 

$$\begin{aligned}
f(1.22) &\approx 0.94780 + (-0.3)(0.03152) + \frac{(0.2)(-0.8)}{2}(-0.00947) + \frac{(0.2)(-0.8)(-0.3)}{6}(-0.00032) \\
&\quad + \frac{(0.2)(-0.96)(-1.8)}{24}(0.00009) \\
&= 0.94780 - 0.009456 + 0.000758 - 0.000003 + 0.000001 \\
&= 0.939100 \doteq 0.93910,
\end{aligned}$$

both results being correct to five places. We see that both formulas 
would give results correct to five places if only third differences were 
retained. 

In a table providing δ² and δ⁴, the entries used in the interpolation for 
x = 1.22 by Everett's first formula would read 

 x  |  f(x)       δ²      δ⁴

1.2 | 0.93204    −931      8
1.3 | 0.96356    −963     10

and the calculation would be of the form 

$$\begin{aligned}
f(1.22) &\approx 0.8(0.93204) - \frac{(0.2)(-0.8)(-1.8)}{6}(-0.00931) - \frac{(1.2)(0.2)(-0.8)(-1.8)(-2.8)}{120}(0.00008) \\
&\quad + 0.2(0.96356) + \frac{(1.2)(0.2)(-0.8)}{6}(-0.00963) + \frac{(2.2)(1.2)(0.2)(-0.8)(-1.8)}{120}(0.00010) \\
&= 0.745632 + 0.000447 + 0.000001 \\
&\quad + 0.192712 + 0.000308 + 0.000001 \\
&= 0.939101 \doteq 0.93910,
\end{aligned}$$

in agreement with the preceding results. The additional computation 
here arises because Everett's formula with fourth differences 
actually incorporates the effects of the first five differences. In 
this case, the retention of only the two second differences would have been 
sufficient. 

Since the analytical expression for f(x) is known, this situation could 
have been predicted by reference to the error formula (4.7.2) which, with 
h = 0.1, s = 0.2, and m = 1, gives 

$$E = 10^{-4}\,\frac{(0.2)(-0.96)(-1.8)}{24}\,f^{iv}(\xi) = 1.44 \times 10^{-6}\,f^{iv}(\xi).$$

Since here f(x) = sin x, there follows $|f^{iv}(\xi)| \le 1$, so that (if no round-off 
errors were present) the error resulting from terminating Everett's first 
formula with second differences would be less than two units in the sixth 
place. Similar error estimates could have been obtained, in advance, 
with reference to the other calculations. Moreover, the calculations are 
considerably simplified when use is made of tabulated values of the 
coefficients (§4.12). 
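
The hand calculations of this section can be repeated mechanically as a check. The sketch below is only an illustration under stated assumptions: it uses the five-place sin x data of the difference table, a straightforward Newton forward-difference sum for f(1.02), and Everett's first formula through fourth differences for f(1.22). The helper names (`differences`, `newton_forward`, `everett`) are hypothetical and do not come from the text.

```python
from math import sin


def differences(values):
    """Columns of a forward-difference table: [f, first diffs, second diffs, ...]."""
    cols = [list(values)]
    while len(cols[-1]) > 1:
        p = cols[-1]
        cols.append([b - a for a, b in zip(p, p[1:])])
    return cols


def newton_forward(values, s, order):
    """Newton forward-difference formula launched from values[0]."""
    cols = differences(values)
    total, coeff = 0.0, 1.0          # coeff is the binomial coefficient C(s, k)
    for k in range(order + 1):
        total += coeff * cols[k][0]
        coeff *= (s - k) / (k + 1)
    return total


def everett(f0, f1, d2_0, d2_1, d4_0, d4_1, s):
    """Everett's first formula through fourth central differences."""
    t = 1.0 - s

    def e2(u):
        return (u + 1) * u * (u - 1) / 6

    def e4(u):
        return (u + 2) * (u + 1) * u * (u - 1) * (u - 2) / 120

    return (t * f0 + e2(t) * d2_0 + e4(t) * d4_0
            + s * f1 + e2(s) * d2_1 + e4(s) * d4_1)


# Five-place values of sin x for x = 1.0(0.1)1.8
F = [0.84147, 0.89121, 0.93204, 0.96356, 0.98545,
     0.99749, 0.99957, 0.99166, 0.97385]

f_102 = newton_forward(F, 0.2, 4)
f_122 = everett(0.93204, 0.96356, -0.00931, -0.00963, 0.00008, 0.00010, 0.2)

print("f(1.02):", f_102, "sin(1.02):", sin(1.02))
print("f(1.22):", f_122, "sin(1.22):", sin(1.22))
```

Comparing the printed values with sin x directly gives the same kind of check that the text obtains from the error formula.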
ee for Dada differentiation and spudasmnaay aie se im 
i i i he interpolation 
i ‘ating and integrating any of t ; 
= ead 13). However, these formulas can be cciate 
ee systematically by operational methods, and their 
wha é 
i to the following chapter. = om 
i. uae ation of Inherent Errors. In addition to the cakes < 
eps rf h certain analytical expressions have been giv i: 
a aad the given data, and in the computa ig 
be taken into account. The latter generally can ee TN : 
must or one or more extra figures in the intermediate ca a _ a 
fae eae to investigate the way in which round-off errors 1 
us 
ffect the interpolation process. 
eee error in the interpolant, teens ne sae Saas 
i linear combination ol ‘dins 
saat Line Seiad When the interpolation Shee a 
na ans and is determined by exact fit to the given m7 at 2 Se : 
nae tants of combination are the Lagrange coe oe ae 
fee 3. In particular, if the error in each given A saa oa 
- rT . then the error in the interpolant cannot excee ° ae 
aa the sum of the absolute values of the relevant rare es 
es ant coefficients, corresponding to retention of sn : fon 
oe ee auiel to tenths in §3.4, whereas a similar table of ve 
are ae corresponding to retention of fourth ee a 
SL Ae Use of the latter table shows, for oe as ; re et 
‘aterpol i i between the thir 
i i t an abscissa midway , 
ee aeadteens due only to data errors not exceeding « In mM 
ver ' At , 
i ot exceed 1.4 in magnitu . — 
ee siting formula, when terminated with a mean odd ar 
' ie Bessel formula, when terminated eee A pee hes oe 
- i solati ials whic e 
tion polynomia ' 
not based on interpola 
seins and hence must be analyzed nied eee 
for avoiding the termination of the Stir ing a eee 
aa ie an differences, when precise error estimates ne es ; 2. 
ee resence of round-off errors in given data 1s sree a 
in seeds with the question as to ie eae aa . hs pag 
i i is r j | 
‘ned in an interpolation. For 
Sty ea prelate’ of the effects of such errors into the differen 
o stu 


themselves. ? aiead try is in error by an excess ¢, d 
Suppose first that a single initial entry entries are assumed to 


ing. ‘Then, if all other initial ; 
pans ar ies the effects of this error will be propagated into 
’ 


of the coeffi- 


errors, 
effects of round-off errors 1n 


inherent errors, 





first five differences of the difference table as follows: 

 f      Δ      Δ²      Δ³      Δ⁴      Δ⁵

        0              0               ε
 0             0               ε
        0              ε              −5ε
 0             ε              −4ε
        ε             −3ε             10ε
 ε            −2ε              6ε
       −ε              3ε            −10ε
 0             ε              −4ε
        0             −ε               5ε
 0             0               ε
        0              0              −ε

This characteristic distribution along a column, in which the successive 
errors alternate in sign and, indeed, vary along the column of rth differences 
as the binomial coefficients associated with $(1 - x)^r$, frequently 
serves to permit one to discover and correct a gross error in a table. 

   f        Δ       Δ²      Δ³

 1.203              18
           221              18
 1.424              36
           257              18
 1.681              54
           311              22
 1.992              76
           387               6
 2.379              82
           469              30
 2.848             112
           581              14
 3.429             126
           707              18
 4.136             144

Thus, for example, the third differences in the accompanying table appear 
to fluctuate irregularly. Their mean value is 18, and the successive 
deviations from the mean, reading downward, are 0, 0, 4, −12, +12, −4, 0. 
Thus an excess ε = 4 in the last place is indicated in the entry 2.379, 
which occupies the row separating the maximum deviations. The corrected 
value is 2.375. A fourth differencing would have given the entries 
0, 4, −16, 24, −16, 4, from which the same conclusion would be drawn. 

When several errors are present, their discovery may be much more 
difficult. 




<< lll 
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Suppose now that all initial entries may be in error by amounts between 
−ε and ε. The most unfavorable situation, with regard to effects on 
differences, is that in which the successive errors are as large as possible 
but are of alternating sign. The error-propagation table, through fourth 
differences, then appears as follows: 

 f       Δ       Δ²       Δ³       Δ⁴

 ε              −4ε               16ε
        −2ε               8ε
−ε               4ε              −16ε
         2ε              −8ε
 ε              −4ε               16ε
        −2ε               8ε
−ε               4ε              −16ε
         2ε              −8ε
 ε              −4ε               16ε

Thus, it follows that errors varying between −ε and ε in the initial data 
will lead to errors varying between $-2^r\epsilon$ and $2^r\epsilon$ in the rth differences. 
Here, for example, if the initial data are correctly rounded to k decimal 
places, $\epsilon = 5 \times 10^{-k-1}$. 

Because of this possible error growth, it usually happens in practice 
that calculated differences beyond a certain order are no longer significant. 
That is, there exists a certain "noise level" such that the effects of initial 
round-offs are of the same order of magnitude as the differences which 
would have been obtained had the initial data been exact. If the initial 
data are rounded, from exactly known data, to k decimal places, then 
round-off errors of magnitude $2^{r-1}/10^k$ are possible in the rth difference. 
Hence rth differences of magnitude appreciably smaller than $2^{r-1}/10^k$ 
are likely to consist largely of "noise." Thus, since k = 5 in the data 
used for the examples of the preceding section, "noise" of magnitude 1, 
2, 4, 8, and 16 units in the fifth place could occur in the respective 
differences of order one through five, although the probability of noise of 
near-maximum magnitude in the rth difference is clearly small and will decrease 
rapidly as r increases. In any case, it would be expected that, since the 
fifth differences in that table are small relative to the permissible noise, 
they are completely meaningless, so that the fluctuation of the fourth 
differences about their mean may also lack significance, in the sense that 
the replacement of those differences by their mean value would lead to 
errors in interpolation of the same order as the errors which are present 
in the given data. 
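
The growth factor of 2 per differencing can be confirmed by propagating the worst-case pattern directly. The sketch below is a small illustration, with errors measured in units of ε and then converted to units of the last retained place on the assumption that ε is half a unit of that place; the function name `worst_case_growth` is hypothetical.

```python
def worst_case_growth(max_order, n=12):
    """Propagate errors +1, -1, +1, ... (in units of epsilon) through the differences."""
    col = [(-1) ** j for j in range(n)]
    growth = []
    for _ in range(max_order):
        col = [b - a for a, b in zip(col, col[1:])]
        growth.append(max(abs(d) for d in col))
    return growth


growth = worst_case_growth(5)            # largest error in each column, in units of epsilon
noise_units = [g / 2 for g in growth]    # epsilon taken as half a unit of the last place

print(growth)
print(noise_units)
```

For five-place data these are the noise levels quoted above for the differences of order one through five.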

4.10. Throwback Techniques. A useful procedure, due to Comrie, 
frequently permits a computer effectively to take into account a neglected 
difference by modifying certain of the differences actually retained in the 
interpolation formula. 
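
As a rough numerical illustration of the idea, the sketch below applies the throwback with Comrie's constant 0.184, as it appears in (4.10.2), to the interpolation for f(1.22) in the five-place sin x table of the preceding examples. Only second differences are carried in the formula, and each is first modified by the corresponding fourth difference. The function names `everett2` and `throwback` are illustrative assumptions, not notation from the text.

```python
def everett2(f0, f1, d2_0, d2_1, s):
    """Everett's first formula truncated after second differences."""
    t = 1.0 - s

    def e2(u):
        return (u + 1) * u * (u - 1) / 6

    return t * f0 + e2(t) * d2_0 + s * f1 + e2(s) * d2_1


def throwback(d2, d4, c=0.184):
    """Modified second difference: second difference minus c times the fourth difference."""
    return d2 - c * d4


# Entries for x = 1.2 and x = 1.3 from the sin x table (s = 0.2 gives x = 1.22)
f0, f1 = 0.93204, 0.96356
d2_0, d2_1 = -0.00931, -0.00963
d4_0, d4_1 = 0.00008, 0.00010

plain = everett2(f0, f1, d2_0, d2_1, 0.2)
modified = everett2(f0, f1, throwback(d2_0, d4_0), throwback(d2_1, d4_1), 0.2)

print("second differences only:", plain)
print("with throwback         :", modified)
```

In this table the fourth differences are small, so the two results differ only in the sixth place; the throwback matters more when the neglected fourth differences amount to some hundreds of units.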

































and calculation shows that, when 0 < 
Mugnitude than 0.00122, wher Mis 
Hence, if 6*fo and 64f, : 0 


feluined in the over. 
-all calculation, the 
P error committed in the thr 
Ow- 


hack cannot 

exceed § unit in that i 

place; if 54 

Men, then 1000 units are permissible (see en. See 


#'/, is replaced by sf, 
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n ulustration, Everett’s fi | 

. ; rst form i i 

ences, can be expressed in the form aN i 
a 


an. (4.10.1) 


f= (1 ~ fo + afr — B= VE = 2) a ¢ EYEE 2 
20 


(s+ 1 _— 
+ ot Dee a 1) en 4. (s + 2)(s — 2) 
| | 20 
n the interval 0 < s < 1, the factors 


(s + 1)(s — 8) 


if (s + 2)(s — 2) 


20 
3 
—z0 to —34. This f 
Ras Pan is fact suggests that 
ae pean : pel value over that interval in fo 
y Comrie, —0.184, differs only slightly from the m ms 
ean 


value (—#%g) of each f i 
ae actor.f Hence, if we define the modified second 


and 


both vary only from 
be 


sf, = 


5%, — 0.1845%f;, (4.10.2) 


Nverett’s form wi i y 
ula th fourth differences may be approximated by the 


formula 


I, ° l - = = 
( 8)fo + sf — se = De = 2) 52fy + et Dee — 0 Bf. 


rhe error associated with the introduction of 


Ai this approximation is 


[(° 5 1) + o84(8)] oy + ei ere 
Jens [(042) +0n06(*t )] oy 


3 


(4.10.4) 
= 1, this sum is not larger in 


the larger of |6*f, 
and |64 
do not exceed 400 units in the last Hee bk 


If the same throwba k (4.10.2) i ff din B ormula, so tha 
thr C. (4.10.2) is e ected i essel’s for la, so t 

and differences bey ond the third are omitted oe 

y 


1 The figure (4/3 
2 + 3)/24 = 0.184 i 
lor for which the magnitude of the palma ia (Sober baal 


of Bessel's formula. The same fi 


error due to throwback is least i 
ast in th 
‘ula (see also Probs, 23 and 24), ert 


gure is conventionally used with the Everett 
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i t, in 
i i rly taken into account, 
itted fourth difference 1s proper! ee 
ane yen : it does not exceed 1000 units in the last decimal pla 
the same 8 ; 

: i i or more 
se lized techniques, relevant to higher case i ee ae 
si es have also been devised by Comrie. eee ar 

irling’ a. 
ben even eon oe as et sae « ere been made to the 
Jation Series. lelerence’ ee 
rn eae les olation formulas considered in this ee ee 7 
eae ae the number 7 of differences retaine : i ee 
; ; A 7 
. ae teat ‘while the spacing h 1s held fixed. ar ee ae 
“ anle example which illustrates this fact, and sta 
sider a 81 
i das an 
— Newton forward-difference formula (4.3.2) were considere 
i . 
infinite series, it would be expressed in the form 


AO) , 1... 
“one 


Af(0) 


f(z) =f) + a eee) 





= f(0) + S200 ne —h)- + @ RH IR), NM 
k=l 


h = 0 
? ou. be ex Tess. ble in t 


x 2 oreh s 
gta) = 400) += | wf + 3p (0) | + 


. x 2k 
-10 +) ea [nes + orn! 500) | 


k=1 


» [x(a? — h?)(@? — 4h?) + °° & — 12h)], (4.11 


imilar i lati 

ond to similar interpo 
ining formulas would correspo ar 

ait - Se uae problall considered here 1s that of determining 
ae series converges to the generating function f(x). 


In the special case when 


(4.11 
f@) =e, 


there follows 


(ea — Vf), "9 Arf(z) = (e" — 1)f 


aj(a) = eet) — oF = 


and hence also 


Arf 0) = (es — 1)". (4.1 


" ined in 
Thus the formal Newton interpolation series for ew may be obtained) 


th « 
ah 

When «x > —h, 
fonvergence is indeed to e*. 
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form 


wont 4% » (oo tele Sparta @e koi. Gis) 
k=1 


This series terminates, and represents e* correctly, when z is zero or a 
positive integral multiple of h. In order to investigate its convergence 


when the series is infinite, we notice first that the ratio of successive terms 
in this series is given by 


eh — ] 
he +1) @ — 
and that, as k > , this ratio tends to —(e* — 1), for all values of z. 


‘Thus we may deduce that the series (4.11.5) converges if |e* — 1| < 1 or 
os" < 2, whereas, if c # 0,h, 2h, . . 
eh SS 2. 


When e** = 2, the series reduces to 


. , it diverges when |e — 1| > 1 or 


= a(t —h)-:- (@ —k-— Ith) 


kth* 


Sie Ss a ee 
=1+ al k eo k-7—2 — 5) 
k=1 
the successive terms of which alternate in sign when k is sufficiently large. 


Now the kth term can be written, in terms of the gamma function, in 
the form 


1+ 
kel 





T(k — x/h) 

—1)k Lad 

(—1! Fane 

ly making use of the fact that T'(k + u) is approximated by k!k«-}, for 
large k, we find that this last ratio is approximated by 

k-2/h-1 


eh eeeY) 9) 


when k is large, and hence that it tends to zero as k > © if and only if 
i> 


-h. 


It follows that the series (4.11.5) converges for all finite values of x if 
ah < log 2 and diverges for all values of x which differ from 0, h, 2h, ... if 
ah > log 2, and that, if ah = log 2, the series converges when and only 
when x > −h. It can be proved that, when the series converges, the 
convergence is indeed to $e^{ax}$. 

If w is considered as a complex variable (but a is real), the preceding 


ilevelopments are unchanged except for the fact that, when ah = log 2, 


® region of convergence is that half of the complex 2 plane for which 
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n—h, If also ais complex, the conditions In the case of the Newton series, with forward differences, it is known 


that, if the series converges to f(x) for any value of x, say $\bar{x}$, in addition to 
x = 0, h, 2h, ..., for which it terminates, then it converges for all 
values of x such that the real part of x is greater than the real part 
of $\bar{x}$. Unless f(x) is analytic in some half plane Re(x) > α, and also 
$|f(re^{i\theta})| < Me^{ar}$ for large r, where M and a are constants such that 
ah < π/2, the series will diverge except when it terminates. If f(x) is 
analytic in such a half plane and if also $|f(re^{i\theta})| < Me^{ar}$ for large r, where 
M and a are constants such that ah < log 2, then the series will converge 
everywhere in that half plane. (For proofs of these statements, see 
Nörlund [13, 168].) 

Thus, for example, the function f(x) = 1/(1 + x²) is analytic when x 
is real, but it possesses poles when x = ±i. Hence the Stirling series 
will diverge for all values of x. Since this function is analytic in the half 
plane x > 0, and since it is dominated in magnitude by any exponential 
function $Me^{ar}$ (a > 0) as r → ∞, the Newton series will converge in that 
half plane. Nevertheless, if both series are launched from the same 
point, the error in the Stirling series will at first decrease much more 
rapidly than that associated with the Newton series, as additional terms 
are incorporated into the calculation. Eventually, the result of adding 
still more terms to the Stirling series will increase its error, whereas the 
error in the Newton series will continue to decrease. However, this 
point of diminishing return in the Stirling formula is likely to be preceded 
either by a stage at which the truncation error has decreased below the 
tolerance imposed or by a stage at which the "noise level" is reached, so 
that the effects of round-off errors would cause the remaining higher 
differences to be undependable in any case. 

Thus, as in many other practical situations, it is quite possible to 
obtain more accurate results by terminating an ultimately divergent 
process at an appropriate stage than by terminating a convergent process at a 
corresponding stage. 

It is evident that, since each partial sum of either the Newton or Stirling 
series represents a polynomial approximation to f(x) corresponding to 


i ha 
the real part of x is greater t . 
h ah 1| ¢ 1. 
a $ og 2. must be replaced by le Ss ; 
ae therefore that, if the Ne wton forward-difference formula were 

’ ) 


i i than 
used for interpolating ¢* where a > 0, with a ear 
(log 2)/a, the successive interpolates corresponding i ae 
of more and more data would eventually begin to oscillate ws aie 
amplitude about the true value. Thus, whereas the retent 


i h 
additional term of the interpolation formula would generally improve the 


F til 
accuracy of the interpolation up to a certain stage, there would exist 


oint beyond which additional terms would correspond to ss . A ae 
(A similar situation was encountered in §3.7.) For = eae ce 
ular, for negative a), this situation would not arise. 4n me 
wale when ah = log 2, convergence would follow if and only : 
mula were not used for backward caakn eee ae - 
i kable in view 
e results are particularly remar 
is a a well-behaved function that its Taylor series converges for 
complex) values of «. - 
ar eae pees in the case when Stirling’s formula is used inste 
. . t 
he fact that here the corresponding int : 
ee Shean ah < 2 log (A + 4/2) and diverges wer it does n 
eating for all a in all other cases, including the case when 


ah = 2 log (1 + V2). 


the corresponding condition is |sinh (ah/2)\ <i. a 
In the general case, it is known that, if the Stirling ae seal : 
any value of © in addition to x = 0, th, +2h, . - oo 
minates), then it converges for all finite values 0 eae 
Hence, conversely, if it diverges for any finite fe yan - oa 
always unless it terminates. In the language of the : ri 
of a complex variable, the Stirling series cannot converge ) “A 
x (except those for which it Leaner sree . oipaihices —- 
} i tion which is analytl 
ee Hee a a aintly, a function whose Taylor series conve 


Tf a is complex, 


ry where. 
oy But even though this be the case, the series stil 


i i le. where f(x) = €%*)- 
as in the preceding example, 
nee \f(re'®)| < Me*" for large 7, where M and a are cons 
ah <7. If these conditions are 
diverge every where except where it terminates. 


| may not conv 
It is also neces 
tants such 


e not satisfied, the Stirling series 
On the other hand, 


vollocation at the points involved, the two sequences of approximations 
differ only in that the former results from the successive introduction of 


the ordinates at x = 0, h, 2h, ..., kh, . .., all of which lie on the 
half line 0 S x < ©, whereas the latter successively introduces the ordi- 
nates at the points: ::, —kh,..., —h,0,h,...,kh,..., in 


wich a way that symmetry is preserved about « = 0. Thus, the con- 







f(a) is an entire function, and if Mh “age 5 oh 5 pa at ACiabeee 
constants such that ah < 2 log ; t : 
ene everywhere. Similar statements apply to the series associated 


Bessel’s formula, 


vergence or divergence of the sequence of approximations truly depends 
upon the sequence of data introduced, rather than upon the type of 
polynomial interpolation formula employed. Whereas an indication of 
the existence of an unfavorable situation is usually afforded by an 
inspection of a relevant difference table, such numerical evidence is not 
available when Lagrangian methods are used. 
4.12. Tables of Interpolation Coefficients. 














This section provides brief 
tables of coefficients relevant to the interpolation formulas which have 
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Pore ae belay INTERPOLATION 

s ~Jo 8 0 2(s) A*fo + C3(s) A®fo + C. (s) A4 C 

Ins ~ fn — 8 Vfu + C2(s) V¥fn — Ca(8) V'fn + Cus) me Gil oY 
(s positive for interpolation) 7? 
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been considered. For more elaborate tables, the references cited in §4.13 ‘ Ca(s) é 
should be consulted. . a(s) C4(s) C:(s) 
LAGRANGE Five-poIntT INTERPOLATION —1.0;/ 1.00000 —1.00000 1.00000 —1.00000 
(see §3.4 for three-point coefficients) 9 
f. = L-als)f2 + Lale)fa + Loledfo + Lala)fi + Lal)ls ee, eens Se ee ee 
for negative s, use lower column labels . -72000 —.67200 .63840 —.61286 
—.7| .59500 —.53550 49534 — 46562 
: Las) b=) Lo(s) Lis) Eats) ~.6| 48000 —.41600 37440 —.34445 
ee _ a? % ea 
.5| 37500 —.31250 .27344  — 24609 
0 900000 .000000 1.000000 000000 .000000 0 —.4]  .28000 —.22400 .19040  —.16755 
—~ 3] 19500 —.14950 12834 + —. 10607 
A 07838 -—.059850 987525 .073150 — . 008668 a —2| 12000 —.08800 .07040 —. 
2 “o14400 —.105600 .950400 . 158400 — .017600 =a ~ i] ‘05500 03860 “omuea gear 
3 019338 —.136850 880525 254150 — .026163 -# ines 
A 022400 . 153600 -806400 .358400 — .033600 —.4 0 . 00000 00000 
5 oo3438 156250 703125 «468750 —.039063 | —-8 woe. Saran 
6 022400 —.145600 582400 .582400 — .041600 —.6 .1 | —.04500 .02850 —.02066 
a 19338 —.122850 447525 696150 — 040168 -.7 -2| —.08000  .04800 —. 03360 pen 
8 “p14400 —.089600 — .302400 .806400 — .033600 —.8 -3| —.10500 05950 —.04016 pas 
9 007838 —.047850 151525 =. 909150 — .020663 -.9 .4 | —.12000 .06400 — 04160 pee 
5 | —.12500 06250 —.03906 02734 
1.0 900000 000000 .000000 1.000000 .000000 | = 0 ‘6 | —.12000 .05600 —.03360 —. 02285 
-7| —.10500 04550 —.02616 —-.01727 
11 | —.008663 051150 —.146475 1.074150 029838 A 8 | —.08000 03200 —.01760 
1.2 "17600 102400 —.281600 1.126400 070400 2 -9| —.04500 01650 —. 00866 pees: 
13 | — 1026163 150150 —.308475 1.151150. 128388 3 ined 
1.4 — .033600 “190400 —.489600 1.142400 190400 4 1.0 00000 00000 
is | —.039063 218750 —.546875 1.093750. 273488 5 ii 
1.6 — :041600 230400 —.561600 .998400 .374400 6 
1.7 — .040163 220150 —.524475 . 849150 . 495338 a s 
1.8 — .033600 182400 —.425600 . 638400 . 638400 8 fo * fo + 8 df ee SE eSEe ON 
1.9 920863 «111150 —.254475 358150. 805838 9 lias 2(8) 5%fo + Ca(s) u5%fo + Ci(s) 84fo 
$ C2(s) C3(s | 
2.0 .000000 .000000 .000000  .000000 1.000000 ss — . 
0 | .00000 — .00000 0 pe an 
: .00000 
L2(s) Ly(s) Ly(s) L_i(s) L_2(s) “1 .00500 —.01650t —.00041 = : 
Nore: All coefficients become exact if each terminal 8 is replaced by 75, and 2 .02000 —.032007 —.00160 26 
terminal 3 by 25. 3 .04500 —.04550f —.00341 a 3 
; 4 | .08000 ~.05600f ~.00560 | —.4 
+ The sequences of interpolation polynomials considered here correspond to b -12500 —.06250t —.00781 Sok 
incorporation of successive ordinates ‘which eventually are at unboundedly incre ‘8 -18000 —.06400t —.00960 6 
distances from the point of interpolation, Whereas the sequence generated by fit 4 .24500 —.05950t —.01041 ie 7 
ordinates at points which divide a fixed finite interval (a,b) into n equal parts 8 -82000 —.04800f —.00960 “8 
allowing n to increase without limit generally is more tractable, there actually 9 .40500 —.02850t —.00641 = 9 
1.0 .50000 —. 00000 .00000 ~1.0 





functions which are continuous in (a,b) but for which this sequence diverges ever 
in (a,b), 


1 Change sign when reading ¢ from right-hand column, 
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4.18. Supplementary References. Davis [246] includes tables of the 
interpolation coefficients relevant to the first six differences in the for- 
mulas of Newton, Stirling, Bessel, and Everett with spacing 0.01 over 


BrssEL INTERPOLATION 
fe ~ ufy + (8 — 4) fy + Cals) ud*%fy + C(s) 854 + Cals) wb4f, + Cs(s) 5°fy 








, C2(8) C;(s) C4(s) C5(s) 8 (0,1), to 11 places, together with tables of corresponding coefficients for 

ere numerical differentiation. Other tabulations are listed in the Index of 
0 .00000  .00000 —.00000 pelsee ar Mathematical Tables [242]. Davis also lists formulas which provide 
A —.04500 .006004 a - peed '8 approximations to the results of inverting truncated Newton and Everett 
F ren pean (01934 — 000774 2 formulas and which are useful for inverse interpolation in some cases, 
‘ . 12000 “00400; .02240 —.00045t 6 although the use of one of the methods outlined in §2.8 (see also Prob. 26) 
5 —.12500 .00000 .02344 .00000 5 is often to be preferred. Salzer [258, 260] gives tables of the relevant 


+ Change sign when reading s from right-hand column. nachictenk Tanclinnas. MLAIe [151] vad Oe eh a 


of difference tables in the detection of errors. See Comrie [244, 245] and 
Lidstone [134] for additional throwback techniques. Interpolation in 
two-way tables can be effected by first applying one of the formulas 
treated in this chapter (or in Chaps. 2 and 3) to interpolate with respect 
to one variable and then using another formula (or the same formula, if 


EverETr INTERPOLATION 
, = (1 — 8)fo + Ca(s) 5°%fo + Cals) 8*fo 
: vee rare) —s) &f; + Cad — s) 54fy 


8 C2(8) C,(s) it is appropriate) to interpolate the results with respect to the second 
oo ns ana. SRNIOA variable (see Prob. 20). Formulas essentially permitting the two proc- 
: Bose "00455 esses to be effected at the same time are considered in Probs. 55 to 58 of 
- — 04800 . 00806 Chap. 5 and are treated in detail by Pearson [180]. See also Steffensen 
.3 | —.05950 .01044 |18] and Willers [21]. The two latter references include some correspond- 
.4| —.06400 .01165 ing formulas for two-way integration (cubature). See also Probs. 53 and 
.5 | —.06250 pak 4 of Chap. 5 and the detailed treatments of Irwin [121] and Radon [189]. 
- eit HOR ‘or discussions of the convergence of sequences of polynomial approx- 
“g | —.03200 .00634 ijmations, and related matters, see Noérlund [13, 168], Bernstein [35], 
.9 | —.01650 .00329 Jackson [123], Feldheim [75], and de la Vallée Poussin [229]. 
1.0 .00000 .00000 





PROBLEMS 
Nection 4.2 


1. Show that 


SrEFFENSEN INTERPOLATION 
fe = fo + Carls) af, + Cals) Oy : 
eee ae A'fe = Vifirr = Ofi++/% Vific = Afar = 8fi—r/2 








s§ Ci(s) Cs(s) of, _ A’fe—r/2 Se V" fete 24 
7 A(fe-1 Age-1) = V(fe Age) = A(fr-1 Vox) = Vfe Von+1), 
—.5| —.12500  .02344 pee eee 
—4| —.12000 02240 @. Hkow that 
~ 3) = on009 ‘O1ado A(fige) = fr Age + gust Mfr, A(fz) = (fr + fest) Aft, 
—.2| —.08000 “00784 A i) — 9 Afe = fe Age a(t) _ _ Ate. 
—.1| —.04500 f was : A 
4. Show that 
0 .00000 .00000 . () : ea 
1] .05500 + —,00866 2)” ae FW) = FA 
'9| .12000 —.01760 4 cos (we + a) = 28in S* cos (wr +a +242) 
3 19500 = 02616 
4] 28000 - os Ar cos (wat + a) = (2sin +) cos (o2 +a +75" +5 
5 87600 = 08 
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Section 4.3 
4. Calculate approximate values of f(x) = sin x for x = 0.50(0.02)0.70 and for 
x = 1.50(0.02)1.70, by applying the appropriate Newtonian formula to the following 
rounded data: 

  x  |   0.5       0.7       0.9       1.1       1.3       1.5       1.7
f(x) | 0.47943   0.64422   0.78333   0.89121   0.96356   0.99749   0.99166


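5. Obtain the formulas 

$$\begin{aligned}
hf'_s &= \Delta f_0 + \tfrac{1}{2}(2s - 1)\,\Delta^2 f_0 + \tfrac{1}{6}(3s^2 - 6s + 2)\,\Delta^3 f_0 + \tfrac{1}{12}(2s^3 - 9s^2 + 11s - 3)\,\Delta^4 f_0 \\
&\quad + \tfrac{1}{120}(5s^4 - 40s^3 + 105s^2 - 100s + 24)\,\Delta^5 f_0 + \cdots
\end{aligned}$$

and 

$$\begin{aligned}
\frac{1}{h}\int_{x_0}^{x_0+sh} f(x)\,dx &= s f_0 + \tfrac{1}{2}s^2\,\Delta f_0 + \tfrac{1}{12}s^2(2s - 3)\,\Delta^2 f_0 + \tfrac{1}{24}s^2(s - 2)^2\,\Delta^3 f_0 \\
&\quad + \tfrac{1}{720}s^2(6s^3 - 45s^2 + 110s - 90)\,\Delta^4 f_0 \\
&\quad + \tfrac{1}{1440}s^2(2s^4 - 24s^3 + 105s^2 - 200s + 144)\,\Delta^5 f_0 + \cdots
\end{aligned}$$

and also obtain corresponding formulas for $hf'_{n-s}$ and for $h^{-1}\int_{x_{n-s}}^{x_n} f(x)\,dx$ in terms of 
backward differences. 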
6. Use the data of Prob. 4 and the results of Prob. 5 to obtain approximate values of 


f'0.6), f(1-6), f" (0-8), f’'(1.6), and of 


pe a 1.7 4 
0.5 f(z) om, ie fie) om 
Section 4.4 


7. Calculate approximate values of f(1.0) from the data of Prob. 4, first by use of 
Gauss's forward formula launched from x = 0.9, and second by use of the backward 
formula launched from x = 1.1. 

8. By specializing Sheppard's rules for the formation of interpolation formulas to 
the case when the relevant abscissas are at a uniform spacing h, show that the 
coefficient of the kth difference encountered in a continuous difference path can be obtained 
by dividing by k! the product of k factors, each of which represents the distance between 
the abscissa of the interpolant and one of the abscissas lying in the region of determination 
of the preceding difference in the path, in units of the spacing, if the result of truncating 
the interpolation formula with the kth difference is to yield exact results at all points 
involved in its formation. Also, illustrate the use of this rule by writing down the 
forward and backward formulas of Newton and Gauss. 

9. Show that the result of truncating the Gauss forward formula with the fourth 
difference can be written in the form 

$$f_s \approx f_0 + s\left\{\delta f_{1/2} + \frac{s - 1}{2}\left[\delta^2 f_0 + \frac{s + 1}{3}\left(\delta^3 f_{1/2} + \frac{s - 2}{4}\,\delta^4 f_0\right)\right]\right\},$$

where the evaluation of the formula is conveniently effected from right to left, and 
write the backward formula, as well as the two Newton formulas, in similar forms. 

Section 4.5
10. Use Stirling’s formula to calculate approximate values of $f(x)$ for $x = 1.00(0.02)1.20$ from the data of Prob. 4.

11. Obtain the formulas

$$hf'_s = \mu\delta f_0 + s\,\delta^2 f_0 + \tfrac16(3s^2-1)\,\mu\delta^3 f_0 + \tfrac1{12}s(2s^2-1)\,\delta^4 f_0 + \tfrac1{120}(5s^4-15s^2+4)\,\mu\delta^5 f_0 + \tfrac1{360}s(3s^4-10s^2+4)\,\delta^6 f_0 + \cdots$$

and

$$\frac1h\int_{x_0-sh}^{x_0+sh} f(x)\,dx = 2s f_0 + \tfrac13 s^3\,\delta^2 f_0 + \tfrac1{180}s^3(3s^2-5)\,\delta^4 f_0 + \tfrac1{7560}s^3(3s^4-21s^2+28)\,\delta^6 f_0 + \cdots$$

and use them to calculate approximate values of $f'(1.1)$, $f'(1.0)$, $f''(1.1)$, $f''(1.0)$, and of

$$\int_{1.0}^{1.2} f(x)\,dx, \qquad \int_{0.9}^{1.3} f(x)\,dx$$

from the data of Prob. 4.


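Section 4.6

12. Use Bessel’s formula to calculate approximate values of $f(x)$ for $x = 0.90(0.02)1.10$ from the data of Prob. 4.

13. Obtain the formulas

$$hf'_{1/2+t} = \delta f_{1/2} + t\,\mu\delta^2 f_{1/2} + \tfrac1{24}(12t^2-1)\,\delta^3 f_{1/2} + \tfrac1{24}t(4t^2-5)\,\mu\delta^4 f_{1/2} + \tfrac1{1920}(80t^4-120t^2+9)\,\delta^5 f_{1/2} + \cdots$$

and

$$\frac1h\int_{x_{1/2}-th}^{x_{1/2}+th} f(x)\,dx = 2t\,\mu f_{1/2} + \tfrac1{12}t(4t^2-3)\,\mu\delta^2 f_{1/2} + \tfrac1{2880}t(48t^4-200t^2+135)\,\mu\delta^4 f_{1/2} + \cdots,$$

where $x_{1/2} = (x_0 + x_1)/2$,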
} + and use th i 
dee Pay Ae to calculate approximate values of f’(1.1), 


’ 








1.1 1.3 
PM fee)dz, fi) He) de, 
from the data of Prob. 4. , 
Section 4.7
14. Use Everett’s first formula to obtain approximate values of $f(x) = \sin x$ for $x = 1.00(0.02)1.20$ from the following data:

x   | f(x)    | $\delta^2$ | $\delta^4$
0.9 | 0.78333 | -3123      | 125
1.1 | 0.89121 | -3553      | 141
1.3 | 0.96356 | -3842      | 155

15. Use Everett’s second formula to obtain approximate values of $f(x) = \sin x$ for $x = 1.00(0.02)1.20$ from the following data:

x   | f(x)    | $\delta^3$
0.9 | 0.78333 |
    |         | -430
1.1 | 0.89121 |
    |         | -289
1.3 | 0.96356 |

16. By integrating Everett’s first formula, obtain the formula

$$\frac1h\int_{x_0}^{x_1} f(x)\,dx = \mu f_{1/2} - \tfrac1{12}\,\mu\delta^2 f_{1/2} + \tfrac{11}{720}\,\mu\delta^4 f_{1/2} - \tfrac{191}{60480}\,\mu\delta^6 f_{1/2} + \cdots$$

and verify that it follows also from the result of Prob. 13. Show that it can be expressed in the alternative form

$$\int_{x_0}^{x_1} f(x)\,dx = \frac h2(f_0+f_1) - \frac h{12}(\mu\delta f_1 - \mu\delta f_0) + \frac{11h}{720}(\mu\delta^3 f_1 - \mu\delta^3 f_0) - \frac{191h}{60480}(\mu\delta^5 f_1 - \mu\delta^5 f_0) + \cdots$$

and deduce Gauss’s sum formula (see also §5.9) in the form

$$\int_{x_0}^{x_n} f(x)\,dx = h\left(\tfrac12 f_0 + f_1 + f_2 + \cdots + f_{n-1} + \tfrac12 f_n\right) - \frac h{12}(\mu\delta f_n - \mu\delta f_0) + \frac{11h}{720}(\mu\delta^3 f_n - \mu\delta^3 f_0) - \frac{191h}{60480}(\mu\delta^5 f_n - \mu\delta^5 f_0) + \cdots$$
Section 4.8 


17. Sketch the function $\pi(x) = x(x-1)(x-2)(x-3)(x-4)$ over the range $-1 \le x \le 5$. Noticing that the error associated with the approximation of $f(x)$ by the result of retaining fourth differences in either of Newton’s or Gauss’s interpolation formulas is of the form $\pi(x)f^{v}(\xi)/120$ for some $\xi$, if $f^{v}(x)$ is continuous, when the ordinates at $x = 0$, 1, 2, 3, and 4 are employed, and assuming that Newton’s formulas would be used principally in $(0,1)$ and $(3,4)$, whereas central-difference formulas would be used principally in $(1,3)$, account for the fact that the former are sometimes erroneously said to be “less accurate” than the latter. If interpolations were effected in the interval $(1,3)$ by both Newton and Gauss formulas, based on the five ordinates at $x = 0$, 1, 2, 3, and 4, and if no round-offs were committed, how would the results actually compare in accuracy? What evidence is afforded by the graph with respect to the general relative dependability of interpolation and extrapolation?

18. If $f(x) = (1 + x)^5$, determine the Stirling and Bessel approximations over $(0,1)$ corresponding to a spacing $h = 1$, with $x_0 = 0$ and $x_1 = 1$, and corresponding to the successive retention of differences through the first, second, third, and fourth. Then calculate the error in each of these eight approximations for $x = 0.0(0.2)1.0$, retaining only one decimal place, and plot the error curves in a common graph. Thus show that, in this example, the following facts are true over $(0,1)$:

(a) The Stirling mean first-difference approximation is better than that which also incorporates the second difference over most of the range, and the mean third-difference approximation is better than that which also incorporates the fourth difference over the entire range.

(b) The Bessel mean second-difference approximation is better than that which also incorporates the third difference over half the range.

(c) The Stirling mean first-difference approximation is better than the three Bessel approximations which employ the first difference, the mean second difference, and the third difference, near $x = \tfrac12$ as well as near $x = 0$.

(d) The Bessel fourth-difference approximation is much better than all the others and is followed successively by the Stirling third-difference and the Stirling fourth-difference approximations.

(Compare the results of Prob. 19.)

19. Proceed as in Prob. 18 with $f(x) = \cos(\pi x/8)$, retaining five decimal places. Thus show that, in this example, the following facts are true over $(0,1)$:

(a) The Stirling zeroth-difference approximation is better than the Bessel first-difference approximation over half the range, whereas the Bessel mean zeroth-difference approximation is better than the Stirling mean first-difference approximation.

(b) The Bessel mean second-difference approximation is better than that which also incorporates the third difference over half the range.

(c) The Stirling second-difference approximation is better than the Bessel third-difference approximation over most of the range.

(d) The Stirling fourth-difference approximation is better than all the others and is followed successively by the Bessel fourth-difference and the Stirling second-difference approximations.

(Compare the results of Prob. 18.)
20. The following data represent rounded four-place values of the elliptic-integral function $E(x,y) = \int_0^y \sqrt{1 - \sin^2 x\,\sin^2 t}\;dt$:

y \ x | 50°    54°    58°    62°
50°   | 0.8134 0.8060 0.7988 0.7920
52°   | 0.8414 0.8332 0.8251 0.8174
54°   | 0.8690 0.8598 0.8508 0.8422
56°   | 0.8962 0.8859 0.8759 0.8663


Determine approximations to $E(52°,51°)$ by (a) interpolating horizontally to obtain intermediate values and then interpolating these values vertically, (b) interpolating vertically and then horizontally, and (c) interpolating directly along a diagonal. Also interpolate as accurately as possible for $E(55.4°,53.1°)$.


Section 4.9 


21. C i 
Eri a . Paeaelne table, corresponding to the results of rounding true 
ee heat or x = 1.0(0.1)3.0 to two decimal places, and study the propa- 
Teeter ee eee errors. Also compare the mean absolute values of the 
pclae oe = dpeaet spear itee with the ideal values, and show that a more regular 
, result from / i 
ne Pe. ae ee aoe ie improper rounding of the values corresponding 
22. Certain of the following 20 consecutive values, corresponding to equally spaced arguments, are incorrect because of typical copying errors. Locate the errors and correct them.
17278 48818 79779 112630 
23424 54440 86249 119398 
29585 60723 92752 126246 
35764 67041 99318 133180 
41964 73398 105937 140206 


Section 4.10

23. Show that the additional error $R(s)$ introduced into Bessel’s formula, by replacing $\mu\delta^2 f_{1/2}$ by $\mu\delta^2 f_{1/2} - k\,\mu\delta^4 f_{1/2}$ and neglecting $\mu\delta^4 f_{1/2}$ otherwise, can be expressed in the form

$$R(s) = \tfrac1{24}\left[(s^2-s)^2 + (12k-2)(s^2-s)\right]\mu\delta^4 f_{1/2},$$

and that the extreme values of the coefficient of $\mu\delta^4 f_{1/2}$ for $0 \le s \le 1$ occur when $s = \tfrac12$ and when $s^2 - s = 1 - 6k$ and are given by $(3-16k)/128$ and $-(1-6k)^2/24$, respectively. Show also that the requirement that the extreme values be of equal magnitude and opposite sign (so that the maximum additional error is minimized) gives $k = (3+\sqrt2)/24 \approx 0.184$, and that $R(s)$ then varies between the limits

$$\pm(3 - 2\sqrt2)\,\mu\delta^4 f_{1/2}/384 \approx \pm 0.00045\,\mu\delta^4 f_{1/2}.$$

24. Show that the additional error $R(s)$ introduced into Everett’s first formula, by replacing $\delta^2 f_0$ and $\delta^2 f_1$ by $\delta^2 f_0 - k\,\delta^4 f_0$ and $\delta^2 f_1 - k\,\delta^4 f_1$ and neglecting fourth differences otherwise, is identical with that associated with Bessel’s formula when $\delta^4 f_0 = \delta^4 f_1$, and hence deduce that, if $k$ is assigned the same value as for Bessel’s formula (Prob. 23), then the additional error cannot exceed 0.00045 times the larger of $|\delta^4 f_0|$ and $|\delta^4 f_1|$ if the two fourth differences have the same sign. Show also that, if those differences are equal in magnitude and of opposite sign, then $R(s)$ is given by

$$R(s) = -\frac{s(s-1)(2s-1)}{30}\left[\frac{(s+1)(s-2)}{4} + 5k\right]\delta^4 f_0 = -\frac{1}{1920}\left[t^5 - 10t^3 + 9t + 80k(t^3 - t)\right]\delta^4 f_0 \qquad (t = 2s - 1),$$

and show that the maximum additional error for $0 \le s \le 1$ is smaller than $0.00122\,|\delta^4 f_0|$ in magnitude. Thus deduce that $|R(s)| < 0.00122M$ in the general case, where $M$ is the larger of $|\delta^4 f_0|$ and $|\delta^4 f_1|$.

25. Solve Prob. 14 by using the throwback technique. 

26. Show that Everett’s modified second-difference formula can be expressed in the form

$$s = \frac{1}{f_1 - f_0}\left\{f_s - f_0 + \frac{s(1-s)}{6}\left[(2-s)\,\delta_m^2 f_0 + (1+s)\,\delta_m^2 f_1\right]\right\}$$

for inverse interpolation between $f_0$ and $f_1$, when $f_s$ is given. Use it iteratively, first replacing $s$ by zero in the coefficients in the right-hand member to calculate an initial approximation to the required value of $s$, and then successively introducing each new approximation into the right-hand member to obtain the next one, to determine approximately the value of $x$ for which $f(x) = 0.9$, with the data of Prob. 14.


Section 4.11 


27. Show that the error corresponding to the truncation of the series in (4.11.5) with the $n$th term is of the form

$$E_n(x) = x(x-h)\cdots(x-nh)\,\frac{\alpha^{n+1}}{(n+1)!}\,e^{\alpha\xi_n},$$

for some $\xi_n$, and deduce that, if $x < nh$, the errors $E_n(x)$ and $E_{n+1}(x)$ are of opposite sign, so that the error is smaller than the first term omitted and of the same sign (see Prob. 5 of Chap. 1). Under the assumption that $e^{\alpha h} > 2$, so that (4.11.5) diverges, show also that the term corresponding to the $(k+1)$th difference is smaller than the preceding one so long as $k$ does not exceed $k_0$, where $k_0$ is the integral part of $[(e^{\alpha h} - 1)x + h]/[h(e^{\alpha h} - 2)]$.

28. Illustrate the results of Prob. 27 by calculating successive approximations to $e^{0.5}$ from successive partial sums of the Newton interpolation series (4.11.5) with $\alpha = 1$ and $h = 1$, when $x = 0.5$. In particular, show that the best approximation is afforded by retention of only two differences, that a consideration of the first neglected term gives the result $1.49009 < e^{0.5} < 1.80717$, and that the mean of these limits differs from the true value by less than one unit in the fourth decimal place.

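29. If $f(x) = e^{\alpha x}$, show that

$$\delta^{2r}f(0) = \left(2\sinh\frac{\alpha h}{2}\right)^{2r}, \qquad \mu\delta^{2r+1}f(0) = \left(2\sinh\frac{\alpha h}{2}\right)^{2r}\sinh\alpha h,$$

and deduce that the formal Stirling series centered at $x = 0$ is of the form

$$e^{\alpha x} = 1 + \frac{x}{h}\sinh\alpha h + \frac{x^2}{2!\,h^2}\left(2\sinh\frac{\alpha h}{2}\right)^2 + \frac{x(x^2-h^2)}{3!\,h^3}\left(2\sinh\frac{\alpha h}{2}\right)^2\sinh\alpha h + \frac{x^2(x^2-h^2)}{4!\,h^4}\left(2\sinh\frac{\alpha h}{2}\right)^4 + \cdots.$$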
30. Calculate six successive a i i 
. C pproximations to the value of e* 

: i of e* when x = 0. 
apres : formula centered at x = 0, with h = 1, and investigate the fae a ie 
; e deviations from the true value. (Notice that the infinite Stirli i 
itself is convergent in this case.) nee 


31. By successively equating the even and odd parts of the two members of the expansion of Prob. 29, and taking $h = 1$, deduce the formal expansions

$$\cosh\alpha x = 1 + x^2\frac{\beta^2}{2!} + x^2(x^2-1^2)\frac{\beta^4}{4!} + x^2(x^2-1^2)(x^2-2^2)\frac{\beta^6}{6!} + \cdots$$

and

$$\frac{\sinh\alpha x}{\beta\cosh(\alpha/2)} = x + x(x^2-1^2)\frac{\beta^2}{3!} + x(x^2-1^2)(x^2-2^2)\frac{\beta^4}{5!} + \cdots,$$

where $\beta = 2\sinh(\alpha/2)$. Show also that these series converge when $|\beta| < 2$.


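32. Show that the formal Bessel-series representation of $f(x) = e^{\alpha x}$, centered at $x = \tfrac12 h$, is of the form

$$e^{\alpha x} = e^{\alpha h/2}\left[\cosh\frac{\alpha h}{2} + \frac{2x-h}{2h}\,\beta + \frac{x(x-h)}{2!\,h^2}\,\beta^2\cosh\frac{\alpha h}{2} + \frac{x(x-h)(2x-h)}{2\cdot 3!\,h^3}\,\beta^3 + \cdots\right],$$

where $\beta = 2\sinh(\alpha h/2)$. Also, by taking $h = 1$, replacing $x$ by $x + \tfrac12$, and successively equating the even and odd parts of the result, deduce the expansions

$$\frac{\cosh\alpha x}{\cosh(\alpha/2)} = 1 + \left(x^2 - \tfrac14\right)\frac{\beta^2}{2!} + \left(x^2 - \tfrac14\right)\left(x^2 - \tfrac94\right)\frac{\beta^4}{4!} + \cdots$$

and

$$\sinh\alpha x = x\beta + x\left(x^2 - \tfrac14\right)\frac{\beta^3}{3!} + x\left(x^2 - \tfrac14\right)\left(x^2 - \tfrac94\right)\frac{\beta^5}{5!} + \cdots,$$

where $\beta = 2\sinh(\alpha/2)$, and show that these series converge when $|\beta| < 2$.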


CHAPTER 5 


OPERATIONS WITH FINITE DIFFERENCES 


5.1. Introduction. The purpose of this chapter is twofold: first, to indicate the power and simplicity of operational methods in deriving a variety of formulas which are useful in various aspects of numerical analysis, and, second, to display certain such formulas for convenient reference and for use in following chapters.

The operational methods which are illustrated supply only the formulas themselves and do not furnish the relevant error term, which therefore must be obtained independently. Many of the formulas could also be obtained by differentiating or integrating an appropriate interpolation formula, although it often would be somewhat more difficult to obtain the rule of formation of the general term in the expansion. However, in such cases, it is clearly possible to deduce the desired error term by differentiating or integrating the known error term relevant to the parent formula.

In addition to formulas for numerical differentiation and integration, generally expressed in terms of forward, backward, or central differences, there are included certain formulas which are useful in subtabulation (§5.7) and approximate summation of series (§§5.8, 5.9).

The concluding sections (§§5.11, 5.12) deal with the problem of determining an expression for the error term relevant to a formula for numerical integration, when the coefficients in the formula are known.

5.2. Difference Operators. For many purposes, it is convenient to think of the symbols $\Delta$, $\nabla$, and $\delta$, defined in the preceding chapter, as operators, which transform a given function $f(x)$ into related functions according to the laws

$$\Delta f(x) = f(x+h) - f(x), \qquad \nabla f(x) = f(x) - f(x-h), \qquad \delta f(x) = f\left(x + \frac h2\right) - f\left(x - \frac h2\right). \tag{5.2.1}$$

Also, in addition to the averaging operator $\mu$, such that

$$\mu f(x) = \frac12\left[f\left(x + \frac h2\right) + f\left(x - \frac h2\right)\right], \tag{5.2.2}$$

we define the shifting operator $E$ such that

$$Ef(x) = f(x+h), \tag{5.2.3}$$

and differential and integral operators $D$ and $J$ with the properties

$$Df(x) = f'(x) = \frac{d}{dx}f(x), \tag{5.2.4}$$

$$Jf(x) = \int_x^{x+h} f(t)\,dt. \tag{5.2.5}$$


In all these definitions except that of $D$, the spacing $h$ is implied. When a more explicit notation is needed, the spacing may be indicated as a subscript, so that, for example, we could write $\delta_{2h}f(x)$ for $f(x+h) - f(x-h)$. Positive integral powers of the operators are defined by iteration. We define the zeroth power of any operator as the identity operator $1$, which leaves any function unchanged. Further, the operator $E^{\alpha}$ is defined for any $\alpha$ so that

$$E^{\alpha}f(x) = f(x + \alpha h),$$

assuming the existence of $f(x + \alpha h)$.

We say that two operators are equal if they yield the same result when applied to any function $f(x)$ to which both are applicable. It is easily verified that the seven operators defined here possess the distributive, commutative, and associative properties shared by real numbers. That is, if $L_1$, $L_2$, and $L_3$ are any of these operators, there follows

$$L_1(L_2 + L_3) = L_1L_2 + L_1L_3, \qquad L_2L_1 = L_1L_2, \tag{5.2.6}$$

$$(L_1L_2)L_3 = L_1(L_2L_3), \qquad (L_1 + L_2) + L_3 = L_1 + (L_2 + L_3). \tag{5.2.7}$$


The exponential law $L^mL^n = L^{m+n}$ is also readily established for each of these operators.

In particular, to show that $D$ and $J$ are commutative, we make the calculation

$$DJf(x) = \frac{d}{dx}\int_x^{x+h} f(t)\,dt = f(x+h) - f(x) = \int_x^{x+h} f'(t)\,dt = JDf(x),$$

and so deduce also that

$$DJ = JD = \Delta. \tag{5.2.8}$$
We may define $L^{-1}$ as an operator such that

$$LL^{-1} = 1, \tag{5.2.9}$$

so that, if $L^{-1}g = f$, then $LL^{-1}g = Lf$ or $g = Lf$, and refer to $L^{-1}$ as an inverse of $L$. It is important, however, to notice that the inverse operator $L^{-1}$ may not be uniquely defined. For if $\omega(x)$ is any function which is annihilated by the operator $L$, so that $L\omega(x) = 0$, and if one interpretation of $L^{-1}g(x)$ is $f(x)$, so that $Lf(x) = g(x)$, then another one is $f(x) + \omega(x)$, since also $L[f(x) + \omega(x)] = g(x)$. That is, we may write $L^{-1}Lf = f + \omega$, where $\omega$ is any function annihilated by $L$. Conversely, if $L^{-1}Lf = f + \omega$, then it must follow that $LL^{-1}Lf = Lf + L\omega$ or $Lf = Lf + L\omega$, so that $L\omega$ must vanish.
If no function is annihilated by an operator $L$, there follows

$$L^{-1}L = LL^{-1} = 1,$$

and $L^{-1}$ is then said to be a proper inverse. Thus, whereas no function is annihilated by $E$, the operators $\Delta$, $\nabla$, and $\delta$ annihilate any function of period $h$, $J$ annihilates the derivative of any such function, and $D$ annihilates any constant. Further, $\mu$ annihilates any so-called odd-harmonic function of period $2h$, that is, any function $f(x)$ for which

$$f(x+h) = -f(x).$$

Hence, care must be taken with respect to the order of operations involving the inverses of those operators.

In the case of the operator $D$, it is seen that $D^{-1}$ corresponds to the formation of an indefinite integral or “antiderivative,” and the situation described corresponds to the fact that, whereas the derivative of that integral is the original function, the integral of the derivative involves an arbitrary additive constant. On the other hand, it should be noticed that $\Delta D^{-1}f(x)$ is uniquely determined, since $\Delta$ annihilates the arbitrary constant. In fact, if we write $D^{-1}f(x) = F(x) + C$, we see that

$$\Delta D^{-1}f(x) = F(x+h) - F(x) = Jf(x),$$

so that we may write also

$$\Delta D^{-1} = J. \tag{5.2.10}$$

This result follows also by using (5.2.8) to deduce that

$$\Delta D^{-1} = JDD^{-1} = J.$$

In the present chapter, we will be concerned principally with applying operators to polynomials. In this connection, we may notice that each of the operators $\Delta$, $\nabla$, $\delta$, and $D$ reduces the degree of any polynomial, and that the same statement is true of any positive integral powers of those operators. We will refer to such operators as delta operators. It should be noticed that $E$, $\mu$, and $J$ are not delta operators.

From the definitions given, we may obtain the relations

$$\Delta = E - 1, \qquad \nabla = 1 - E^{-1}, \qquad \delta = E^{1/2} - E^{-1/2}, \qquad \mu = \tfrac12\left(E^{1/2} + E^{-1/2}\right), \tag{5.2.11}$$

whereas (5.2.8) leads also to the relation

$$DJ = JD = E - 1, \tag{5.2.12}$$

so that these operators are simply expressed in terms of $E$.
Further, if $r$ is any nonnegative integer, there follows

$$\Delta^r = E^r\nabla^r = E^{r/2}\delta^r = E^r - \binom r1 E^{r-1} + \binom r2 E^{r-2} - \cdots + (-1)^{r-1}\binom r1 E + (-1)^r,$$

and hence, by applying these equal operators to $y(x_k) = y_k$, we obtain

$$\Delta^r y_k = y_{k+r} - \binom r1 y_{k+r-1} + \binom r2 y_{k+r-2} - \cdots + (-1)^{r-1}\binom r1 y_{k+1} + (-1)^r y_k,$$

$$\nabla^r y_k = y_k - \binom r1 y_{k-1} + \binom r2 y_{k-2} - \cdots + (-1)^{r-1}\binom r1 y_{k-r+1} + (-1)^r y_{k-r},$$

$$\delta^r y_k = y_{k+r/2} - \binom r1 y_{k+r/2-1} + \cdots + (-1)^{r-1}\binom r1 y_{k-r/2+1} + (-1)^r y_{k-r/2}. \tag{5.2.13}$$


These relations permit the calculation of an arbitrary difference in terms of the tabulated ordinates, without the formation of differences of lower order, the coefficients of successive ordinates being merely binomial coefficients prefixed by alternating signs.
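
As a small illustrative sketch (not part of the original text), the first relation of (5.2.13) can be checked against a conventional difference table. The function names `forward_difference` and `difference_table_entry` are hypothetical, and unit spacing with integer ordinates is assumed so that the comparison is exact.

```python
from math import comb

def forward_difference(y, k, r):
    """r-th forward difference at index k from the alternating binomial sum of (5.2.13)."""
    return sum((-1) ** i * comb(r, i) * y[k + r - i] for i in range(r + 1))

def difference_table_entry(y, k, r):
    """The same quantity obtained by forming r successive difference columns."""
    col = list(y)
    for _ in range(r):
        col = [b - a for a, b in zip(col, col[1:])]
    return col[k]

y = [n ** 3 for n in range(8)]           # a cubic sampled at unit spacing
third = forward_difference(y, 2, 3)      # third difference of n^3 equals 3! = 6
fourth = forward_difference(y, 1, 4)     # fourth difference of a cubic vanishes
```

The direct sum needs only the $r + 1$ ordinates involved, whereas the table route forms every intermediate column.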
From the relations of (5.2.11), we may properly deduce the relations

$$E - \Delta = 1, \qquad E(1 - \nabla) = 1, \qquad \left(E^{1/2} - \tfrac12\delta\right)^2 = 1 + \tfrac14\delta^2, \qquad \mu^2 = 1 + \tfrac14\delta^2,$$

after which the formal symbolism of elementary algebra suggests the forms

$$E = 1 + \Delta, \qquad E = \frac{1}{1-\nabla}, \qquad E^{1/2} = \left(1 + \tfrac14\delta^2\right)^{1/2} + \tfrac12\delta, \qquad \mu = \left(1 + \tfrac14\delta^2\right)^{1/2}. \tag{5.2.14}$$

Whereas the first form requires no explanation, the term $1/(1 - \nabla)$ can be interpreted at this stage only as representing the inverse of the operator $1 - \nabla$, that is, as an alternative notation for the operator $(1 - \nabla)^{-1}$, such that

$$(1 - \nabla)(1 - \nabla)^{-1} = 1, \tag{5.2.15}$$

whereas the derivation of the third form shows that $(1 + \tfrac14\delta^2)^{1/2}$ must represent an operator such that its iterate is the operator $1 + \tfrac14\delta^2$,

$$\left[\left(1 + \tfrac14\delta^2\right)^{1/2}\right]^2 = 1 + \tfrac14\delta^2. \tag{5.2.16}$$

If we now suppose that the function upon which the operations are to be effected is a polynomial $p_n(x)$, of degree $n$, we may obtain a more useful interpretation of these operators. For, if $t$ is a variable, we have the identity

$$(1 - t)(1 + t + t^2 + \cdots + t^n) = 1 - t^{n+1}$$

for any nonnegative integer $n$. It is permissible to replace $t$ by $\nabla$ (or by any other symbol representing a distributive operator), to give

$$(1 - \nabla)(1 + \nabla + \nabla^2 + \cdots + \nabla^n) = 1 - \nabla^{n+1}. \tag{5.2.17}$$

Since the operator $\nabla^{n+1}$ will annihilate $p_n(x)$, the right-hand operator in (5.2.17) is equivalent to the unit operator for any $p_n(x)$. Since the operator $(1 - \nabla)^{-1}$ is uniquely defined by (5.2.15), it follows that we may write

$$(1 - \nabla)^{-1} = 1 + \nabla + \nabla^2 + \cdots + \nabla^n$$

when only polynomials of degree $n$ or less are to be affected by the operator. More generally, we are justified in writing

$$\frac{1}{1-\nabla} \equiv (1 - \nabla)^{-1} = 1 + \nabla + \nabla^2 + \cdots, \tag{5.2.18}$$

when the class of all polynomials is included, since the finite number of required terms for which the exponent of $\nabla$ does not exceed the degree of the polynomial is present, and the remaining terms each annihilate the polynomial.
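
The following sketch, which is an illustration added here and not the author's, checks (5.2.18) on a cubic: since $E = (1 - \nabla)^{-1}$, summing the backward differences $\nabla^k p(x)$ for $k = 0, \ldots, 3$ should reproduce $p(x + h)$. The helper name `backward_differences` and the sample polynomial are assumptions made for the example.

```python
def backward_differences(p, x, h, n):
    """Return [p(x), ∇p(x), ∇²p(x), ..., ∇ⁿp(x)] for spacing h."""
    col = [p(x - j * h) for j in range(n + 1)]       # p(x), p(x-h), ..., p(x-nh)
    out = [col[0]]
    for _ in range(n):
        col = [a - b for a, b in zip(col, col[1:])]  # apply ∇ to every entry
        out.append(col[0])
    return out

def p(x):
    return 2 * x**3 - x + 5      # a polynomial of degree 3

x, h = 4, 1
shifted = sum(backward_differences(p, x, h, 3))      # (1 + ∇ + ∇² + ∇³) p(x)
```

Here the series stops after $\nabla^3$ because every higher backward difference of a cubic vanishes.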
In a similar way, it is easily seen that if we retain only the terms which involve powers of $\delta$ which do not exceed $n$ in the formal expansion

$$\left(1 + \tfrac14\delta^2\right)^{1/2} = 1 + \tfrac18\delta^2 - \tfrac1{128}\delta^4 + \cdots,$$

the resultant polynomial in $\delta$ possesses the property that its square differs from $1 + \tfrac14\delta^2$ by the product of $\delta^{n+1}$ and a polynomial in $\delta$, and hence is equivalent to $1 + \tfrac14\delta^2$ for any $p_n(x)$. Clearly the negative of the series also has this property. However, a consideration of the third relation in (5.2.14) (when the operand is, say, an arbitrary constant) shows that the former alternative is the proper one, so that we are justified in writing

$$E^{1/2} = \left(1 + \tfrac14\delta^2\right)^{1/2} + \tfrac12\delta = 1 + \tfrac12\delta + \tfrac18\delta^2 - \tfrac1{128}\delta^4 + \cdots \tag{5.2.19}$$

when we deal only with polynomials.

It then follows that we may write

$$E^s = (1 + \Delta)^s = (1 - \nabla)^{-s} = \left[\left(1 + \tfrac14\delta^2\right)^{1/2} + \tfrac12\delta\right]^{2s} \tag{5.2.20}$$

when $s$ is any integer, so long as we deal only with polynomials. The extension to the more general case when $s$ is any rational number offers no difficulties. It is also possible to give a rigorous direct justification of the use of these expansions when $s$ is irrational, although the required arguments are somewhat more subtle.


The first two equivalences in (5.2.20) are, in fact, seen to be symbolic representations of the relations

$$p(x_0 + sh) = \sum_{k=0}^{\infty}\binom{s}{k}\Delta^k p(x_0) = \sum_{k=0}^{\infty}(-1)^k\binom{-s}{k}\nabla^k p(x_0), \tag{5.2.21}$$

to which the previously established Newton forward- and backward-difference formulas (4.3.11) and (4.3.13) reduce when $f(x)$ is replaced by a polynomial $p(x)$, since only a finite number of terms then do not vanish and since the remainder term also vanishes. This fact can be considered as constituting an indirect proof of the validity of (5.2.20) when $s$ is unrestricted.

As was discussed in §4.11, the series in (5.2.21) frequently do not con- 
verge when p(x) is replaced by a function f(x) other than a polynomial; 
they must be truncated, say, after n + 1 terms, and the appropriate error 
term (4.3.6) or (4.3.9) then must be added. However, the coefficients in 
the formula are not dependent upon the nature of f(x), and the present 
operational methods serve to determine those coefficients in a simple and 
systematic way. 


The equivalence of the extreme members of (5.2.20) can be expressed in a variety of forms, such as

$$E^s = \left[\left(1 + \tfrac14\delta^2\right)^{1/2} + \tfrac12\delta\right]^{2s} = \left[1 + \tfrac12\delta^2 + \delta\left(1 + \tfrac14\delta^2\right)^{1/2}\right]^s = \left(1 + \tfrac12\delta^2 + \mu\delta\right)^s = \left(1 + E^{1/2}\delta\right)^s = \left(1 - E^{-1/2}\delta\right)^{-s}. \tag{5.2.22}$$

The operational formula obtained by expanding the first or second of these expressions would correspond to an interpolation employing the central differences $\delta^{2m+1}f(x_0)$ as well as the central differences of even order. Since the former differences are generally not available in tabular work, this formula would be of limited use. The Stirling formula could be obtained by expanding the third expression and afterward replacing $\mu^{2m}$ by $(1 + \delta^2/4)^m$ and $\mu^{2m+1}$ by $\mu(1 + \delta^2/4)^m$. The two Gaussian formulas could be obtained from the remaining two expressions. Since the results have been obtained in the preceding chapter (see also Prob. 5), these calculations are omitted here, and other derivations are indicated in the following sections.


In the remainder of this chapter we shall proceed, in general, as though we were concerned only with polynomials, and we shall indicate this fact by writing $p(x)$ in place of $f(x)$. Formulas to be obtained will then have been established as identities for any polynomial $p(x)$, in which case all series of delta operators will effectively terminate. The determination of the proper error term to be introduced when a formula is applied to a function of more general type, after a truncation, is then to be considered in each case as a separate problem.
5.3. Differentiation Formulas. In order to obtain formulas for numerical differentiation, by operational methods, it is necessary to express the operator $D$ in terms of difference operators. For this purpose, we notice that the Taylor-series expansion

$$p(x + h) = p(x) + \frac{h}{1!}p'(x) + \frac{h^2}{2!}p''(x) + \cdots + \frac{h^n}{n!}p^{(n)}(x) + \cdots$$

can be written in the operational form

$$Ep(x) = \left(1 + \frac{hD}{1!} + \frac{h^2D^2}{2!} + \cdots + \frac{h^nD^n}{n!} + \cdots\right)p(x). \tag{5.3.1}$$

Since the series in parentheses is the expansion of the function $e^{hD}$, we deduce the curious relationship

$$E = e^{hD}, \tag{5.3.2}$$

which is to be interpreted as an abbreviation of the statement that the operators $E$ and $\sum_{k=0}^{n}(hD)^k/k!$ are equivalent when applied to any polynomial $p_n(x)$ of degree $n$, for any $n$.
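
A minimal sketch of this statement, added for illustration under the assumption that a polynomial is stored as a list of coefficients with the constant term first: applying the terminating series $\sum_k (hD)^k/k!$ to $p$ and evaluating at $x$ should give exactly $p(x + h)$. The names `derivative`, `evaluate`, and `shift_by_taylor` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Coefficients (constant term first) of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def shift_by_taylor(coeffs, x, h):
    """Apply sum_k (hD)^k / k! to the polynomial and evaluate the result at x."""
    x, h = Fraction(x), Fraction(h)
    total, d, k = Fraction(0), list(coeffs), 0
    while d:                      # terminates: D^(n+1) annihilates a degree-n polynomial
        total += h**k / factorial(k) * evaluate(d, x)
        d, k = derivative(d), k + 1
    return total

p = [5, -1, 0, 2]                 # 5 - x + 2x^3
```

Exact rational arithmetic is used so that the equality with $p(x + h)$ is not blurred by round-off.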
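Further, we obtain the additional relations

$$hD = \log E = \log(1 + \Delta) = -\log(1 - \nabla) = 2\log\left[\left(1 + \tfrac14\delta^2\right)^{1/2} + \tfrac12\delta\right] = 2\sinh^{-1}\frac{\delta}{2}. \tag{5.3.3}$$

Here, for example, the symbolic relation $hD = \log(1 + \Delta)$ asserts that the operators $hD$ and $P_n(\Delta) = \Delta - \tfrac12\Delta^2 + \cdots + (-1)^{n+1}\tfrac1n\Delta^n$ are equivalent when applied to any $p_n(x)$. Its validity can be verified by noticing that, since $\Delta$ and $hD/1! + \cdots + (hD)^n/n!$ are equivalent for any $p_n(x)$, we may replace $\Delta$ by this operator in $P_n(\Delta)$. The result will differ from $hD$ by a polynomial of the form $a_{n+1}(hD)^{n+1} + \cdots + a_{n^2}(hD)^{n^2}$, which will annihilate $p_n(x)$.

In terms of forward differences, we thus deduce the formula

$$p'_0 = \frac1h\log(1 + \Delta)\,p_0 = \frac1h\left(\Delta - \tfrac12\Delta^2 + \tfrac13\Delta^3 - \cdots\right)p_0. \tag{5.3.4}$$

By iteration, there follows also

$$p_0^{(r)} = \frac{1}{h^r}\left[\log(1 + \Delta)\right]^r p_0 = \frac{1}{h^r}\left(\Delta - \tfrac12\Delta^2 + \tfrac13\Delta^3 - \cdots\right)^r p_0 = \frac{1}{h^r}\left[\Delta^r - \frac r2\Delta^{r+1} + \frac{r(3r+5)}{24}\Delta^{r+2} - \frac{r(r+2)(r+3)}{48}\Delta^{r+3} + \cdots\right]p_0. \tag{5.3.5}$$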
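
The expansion (5.3.5) can be exercised with a short sketch, offered here as an illustration rather than as the text's own procedure. It forms the power series of $[\log(1 + \Delta)]^r$ by repeated truncated multiplication and applies it to the leading differences of tabulated values; all function names are hypothetical, and exact fractions are assumed so that a polynomial is differentiated without error.

```python
from fractions import Fraction

def log_series(n):
    """Coefficients of log(1 + Δ) = Δ - Δ²/2 + Δ³/3 - ... through Δ^n."""
    return [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, n + 1)]

def multiply(a, b, n):
    """Product of two power series, truncated after the n-th power."""
    out = [Fraction(0)] * (n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= n:
                out[i + j] += ai * bj
    return out

def derivative_weights(r, n):
    """Coefficients of [log(1 + Δ)]^r through Δ^n, as in (5.3.5)."""
    out = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(r):
        out = multiply(out, log_series(n), n)
    return out

def leading_differences(y):
    """Leading forward differences Δ^k y_0 for k = 0, 1, ..., len(y) - 1."""
    col, out = list(y), []
    while col:
        out.append(col[0])
        col = [b - a for a, b in zip(col, col[1:])]
    return out

def derivative_at_start(y, h, r):
    """r-th derivative at the first tabular point from all available differences."""
    d = leading_differences(y)
    w = derivative_weights(r, len(d) - 1)
    return sum(wk * dk for wk, dk in zip(w, d)) / Fraction(h) ** r

y = [(x + 1) ** 4 for x in range(5)]   # p(x) = (x + 1)^4 tabulated at x = 0, 1, 2, 3, 4
```

With five ordinates of a quartic the series terminates at $\Delta^4$, so the computed derivatives at $x = 0$ are exact.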


The coefficients in this expansion are expressible in terms of the so-called Stirling numbers of the first kind, which may be denoted by $S_k^{(r)}$, and which are then defined by the relation

$$\frac{[\log(1 + \Delta)]^r}{r!} = \sum_{k=r}^{\infty}\frac{S_k^{(r)}}{k!}\Delta^k, \tag{5.3.6}$$

so that (5.3.5) can be written in the form

$$p_0^{(r)} = \frac{1}{h^r}\left[1 + \frac{S_{r+1}^{(r)}}{r+1}\Delta + \frac{S_{r+2}^{(r)}}{(r+1)(r+2)}\Delta^2 + \cdots\right]\Delta^r p_0. \tag{5.3.7}$$
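
As a hedged illustration, the bracketed coefficients of (5.3.7) can be generated from the usual recurrence for signed Stirling numbers of the first kind, $s(k, r) = s(k-1, r-1) - (k-1)\,s(k-1, r)$, and compared with the explicit coefficients of (5.3.5). The identification of $S_k^{(r)}$ with $s(k, r)$ and the function names are assumptions of this sketch.

```python
from fractions import Fraction

def stirling_first(k, r):
    """Signed Stirling number of the first kind s(k, r), by the standard recurrence."""
    if k == r:
        return 1
    if r == 0 or r > k:
        return 0
    return stirling_first(k - 1, r - 1) - (k - 1) * stirling_first(k - 1, r)

def bracket_coefficients(r, terms):
    """Coefficients of Δ^0, Δ^1, ... inside the brackets of (5.3.7)."""
    coeffs, denom = [], 1
    for j in range(terms):
        if j > 0:
            denom *= r + j                 # builds (r+1)(r+2)...(r+j)
        coeffs.append(Fraction(stirling_first(r + j, r), denom))
    return coeffs
```

For example, `bracket_coefficients(2, 3)` gives the multipliers of $\Delta^2$, $\Delta^3$, and $\Delta^4$ in the second-derivative formula.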


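In a completely similar way, the corresponding backward-difference formulas are obtained in the form

$$p'_n = -\frac1h\log(1 - \nabla)\,p_n = \frac1h\left(\nabla + \tfrac12\nabla^2 + \tfrac13\nabla^3 + \cdots\right)p_n \tag{5.3.8}$$

and

$$p_n^{(r)} = \frac{1}{h^r}\left(\nabla + \tfrac12\nabla^2 + \tfrac13\nabla^3 + \cdots\right)^r p_n = \frac{1}{h^r}\left[\nabla^r + \frac r2\nabla^{r+1} + \frac{r(3r+5)}{24}\nabla^{r+2} + \frac{r(r+2)(r+3)}{48}\nabla^{r+3} + \cdots\right]p_n = \frac{1}{h^r}\left[1 - \frac{S_{r+1}^{(r)}}{r+1}\nabla + \frac{S_{r+2}^{(r)}}{(r+1)(r+2)}\nabla^2 - \cdots\right]\nabla^r p_n. \tag{5.3.9}$$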
From the last form of (5.3.3), there follows symbolically

$$p'_0 = \frac2h\sinh^{-1}\frac{\delta}{2}\,p_0, \tag{5.3.10}$$

and several types of central-difference expansions are possible. Since the right-hand member is an odd function of $\delta$, its expansion in powers of $\delta$ would involve odd central differences, which are not generally useful in tabular interpolation.

To obtain a result equivalent to that of differentiating the Stirling formula and evaluating the result at $x_0$, we require an expansion involving mean odd central differences. Hence, by multiplying the right-hand member by $\mu$ and dividing by its equivalent $\sqrt{1 + \delta^2/4}$, we obtain the form

$$p'_0 = \frac{\mu}{h}\,\frac{2\sinh^{-1}(\delta/2)}{\sqrt{1 + \delta^2/4}}\,p_0 = \frac{\mu}{h}\left(\delta - \frac{1^2}{3!}\delta^3 + \frac{1^2\cdot 2^2}{5!}\delta^5 - \frac{1^2\cdot 2^2\cdot 3^2}{7!}\delta^7 + \cdots\right)p_0. \tag{5.3.11}$$

This formula is useful for calculating the derivative at interior tabular points, whereas (5.3.4) and (5.3.8) would be required at end points of the tabulation. Intermediate values are then conveniently interpolated from these values.
these values. 
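Similarly, derivatives of even order $2m$ are obtained by use of the operator $D^{2m}$, where $D$ is expressed as in (5.3.10),

$$D = \frac2h\sinh^{-1}\frac{\delta}{2} = \frac1h\left(\delta - \frac{1^2}{2^2\cdot 3!}\delta^3 + \frac{1^2\cdot 3^2}{2^4\cdot 5!}\delta^5 - \frac{1^2\cdot 3^2\cdot 5^2}{2^6\cdot 7!}\delta^7 + \cdots\right). \tag{5.3.12}$$

Derivatives of odd order $2m + 1$ are obtained by multiplying the operator in (5.3.11) by $D^{2m}$. Thus, for example, we may obtain the formula

$$p''_0 = \frac{1}{h^2}\left(\delta - \frac{1^2}{2^2\cdot 3!}\delta^3 + \frac{1^2\cdot 3^2}{2^4\cdot 5!}\delta^5 - \cdots\right)^2 p_0 = \frac{1}{h^2}\left(\delta^2 - \tfrac1{12}\delta^4 + \tfrac1{90}\delta^6 - \tfrac1{560}\delta^8 + \cdots\right)p_0. \tag{5.3.13}$$

Other formulas obtainable in this way may be listed as follows:

$$p'''_0 = \frac{\mu}{h^3}\left(\delta^3 - \tfrac14\delta^5 + \tfrac7{120}\delta^7 - \cdots\right)p_0, \tag{5.3.14}$$

$$p^{iv}_0 = \frac{1}{h^4}\left(\delta^4 - \tfrac16\delta^6 + \tfrac7{240}\delta^8 - \cdots\right)p_0, \tag{5.3.15}$$

$$p^{v}_0 = \frac{\mu}{h^5}\left(\delta^5 - \tfrac13\delta^7 + \tfrac{13}{144}\delta^9 - \cdots\right)p_0. \tag{5.3.16}$$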
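
To see how the truncated series (5.3.13) behaves on a function which is not a polynomial, the following sketch (an added illustration with hypothetical function names) forms the even central differences of $\sin x$ at $x_0 = 1.1$ with $h = 0.2$ and compares the one-term and three-term approximations to $f''(x_0) = -\sin 1.1$. Floating-point arithmetic is assumed, so the results carry round-off as well as truncation error.

```python
import math

def central_differences_even(f, x0, h, max_order):
    """Return {2m: δ^(2m) f(x0)} for 2m = 2, 4, ..., max_order from samples at x0 + j*h."""
    m = max_order // 2
    col = [f(x0 + j * h) for j in range(-m, m + 1)]
    out = {}
    for order in range(1, max_order + 1):
        col = [b - a for a, b in zip(col, col[1:])]
        if order % 2 == 0:
            out[order] = col[len(col) // 2]    # centre entry belongs to x0
    return out

def second_derivative(f, x0, h, terms=3):
    """Truncate (5.3.13) after `terms` terms (at most four are tabulated here)."""
    weights = [1.0, -1.0 / 12.0, 1.0 / 90.0, -1.0 / 560.0][:terms]
    d = central_differences_even(f, x0, h, 2 * terms)
    return sum(w * d[2 * (i + 1)] for i, w in enumerate(weights)) / h**2

one_term = second_derivative(math.sin, 1.1, 0.2, terms=1)
three_terms = second_derivative(math.sin, 1.1, 0.2, terms=3)
exact = -math.sin(1.1)
```

In this instance each retained term reduces the error; the first term omitted then serves as a rough indication of the error, subject to the caution about such estimates.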


The error term to be introduced in each case, when $p(x)$ is replaced by a function $f(x)$ which is not a polynomial, so that the series must be terminated with, say, $r$th differences, can be determined by use of the results of §3.3, if it is noticed that the result of truncation corresponds to the differentiation of a polynomial of degree $r$ which agrees with $f(x)$ at $x_0, x_1, \ldots, x_r$ in the case of (5.3.5), at $x_n, x_{n-1}, \ldots, x_{n-r}$ in the case of (5.3.9), and at $x_{-m}, x_{-m+1}, \ldots, x_m$ in the case of the central-difference formulas, where $m = r/2$ if $r$ is even, and $m = (r + 1)/2$ if $r$ is odd. In the case of the forward- and backward-difference formulas, when the differentiation is effected at an end point of the range, formula (3.3.20) is valid. However, in the central-difference formulas, the more complicated formula (3.3.15) cannot be avoided. In practice, unless the analytical definition of $f(x)$ is known and is of sufficiently simple form to permit the determination and estimation of corresponding analytical expressions for the higher derivatives involved in those error terms, one generally must estimate the error by considering the magnitude of the first term omitted, realizing that this estimate is not necessarily a dependable one. The importance of inherent errors in numerical differentiation has already been emphasized in §3.8.

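Formulas for differentiation at a point midway between two tabular points are obtained by writing $D$ in the form (5.3.12) and operating on $p_{1/2}$. In addition, in order to obtain ordinary odd central differences and mean even central differences at $s = \tfrac12$, we must multiply the expansion by the unit operator $\mu/\sqrt{1 + \delta^2/4}$ in calculating derivatives of even order, whereas in the preceding case this device was necessary when calculating derivatives of odd order. Thus we have, symbolically,

$$p_{1/2}^{(2m)} = \frac{\mu}{\sqrt{1 + \delta^2/4}}\left[\frac2h\sinh^{-1}\frac{\delta}{2}\right]^{2m}p_{1/2} \tag{5.3.17}$$

and

$$p_{1/2}^{(2m+1)} = \left[\frac2h\sinh^{-1}\frac{\delta}{2}\right]^{2m+1}p_{1/2}. \tag{5.3.18}$$

In particular, when $m = 0$ in (5.3.17), we thus rederive (4.6.8) in the form

$$p_{1/2} = \mu\left(1 - \tfrac18\delta^2 + \tfrac3{128}\delta^4 - \tfrac5{1024}\delta^6 + \tfrac{35}{32768}\delta^8 - \cdots\right)p_{1/2} \tag{5.3.19}$$

and obtain also derivative formulas which may be listed as follows:

$$p'_{1/2} = \frac1h\left(\delta - \tfrac1{24}\delta^3 + \tfrac3{640}\delta^5 - \tfrac5{7168}\delta^7 + \cdots\right)p_{1/2}, \tag{5.3.20}$$

$$p''_{1/2} = \frac{\mu}{h^2}\left(\delta^2 - \tfrac5{24}\delta^4 + \tfrac{259}{5760}\delta^6 - \tfrac{3229}{322560}\delta^8 + \cdots\right)p_{1/2}, \tag{5.3.21}$$

$$p'''_{1/2} = \frac1{h^3}\left(\delta^3 - \tfrac18\delta^5 + \tfrac{37}{1920}\delta^7 - \cdots\right)p_{1/2}, \tag{5.3.22}$$

$$p^{iv}_{1/2} = \frac{\mu}{h^4}\left(\delta^4 - \tfrac7{24}\delta^6 + \tfrac{47}{640}\delta^8 - \cdots\right)p_{1/2}, \tag{5.3.23}$$

$$p^{v}_{1/2} = \frac1{h^5}\left(\delta^5 - \tfrac5{24}\delta^7 + \cdots\right)p_{1/2}. \tag{5.3.24}$$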
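
The numerical coefficients in (5.3.20) and (5.3.21) can be regenerated mechanically. The sketch below is an added illustration, assuming the standard power series of $\sinh^{-1}$ and of $(1 + x)^{-1/2}$; it multiplies truncated series in $\delta$ with exact fractions. The names `mul`, `central_binomial_ratio`, `hD`, and `inv_mu` are hypothetical.

```python
from fractions import Fraction
from math import factorial

N = 8  # keep powers of δ through δ^N

def mul(a, b):
    """Product of two power series in δ, truncated after δ^N."""
    out = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                out[i + j] += ai * bj
    return out

def central_binomial_ratio(n):
    """(2n)! / (4^n (n!)^2), the magnitude of the binomial coefficient C(-1/2, n)."""
    return Fraction(factorial(2 * n), 4**n * factorial(n) ** 2)

# hD = 2 asinh(δ/2): the coefficient of δ^(2n+1) is (-1)^n C_n / ((2n+1) 4^n)
hD = [Fraction(0)] * (N + 1)
for n in range((N + 1) // 2):
    hD[2 * n + 1] = (-1) ** n * central_binomial_ratio(n) / ((2 * n + 1) * 4**n)

# 1 / sqrt(1 + δ²/4): the coefficient of δ^(2n) is (-1)^n C_n / 4^n
inv_mu = [Fraction(0)] * (N + 1)
for n in range(N // 2 + 1):
    inv_mu[2 * n] = (-1) ** n * central_binomial_ratio(n) / 4**n

first = hD                           # series of (5.3.20), to be divided by h
second = mul(inv_mu, mul(hD, hD))    # series of (5.3.21), to be prefixed by μ / h²
```

The same two series, multiplied in other combinations, yield the remaining formulas of the list.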


In certain applications, it is desirable to express differences at a point in terms of derivatives at that point. This is the inverse of the problem just considered. Thus, in order to express forward differences in terms of derivatives, we again refer to (5.3.3) and obtain the relation

$$\Delta^r = \left(e^{hD} - 1\right)^r = \left(\frac{hD}{1!} + \frac{h^2D^2}{2!} + \cdots\right)^r. \tag{5.3.25}$$

Thus there follows

$$\Delta^r p_0 = \left[(hD)^r + \frac r2(hD)^{r+1} + \frac{r(3r+1)}{24}(hD)^{r+2} + \frac{r^2(r+1)}{48}(hD)^{r+3} + \cdots\right]p_0. \tag{5.3.26}$$


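The coefficients in this formula are expressible in terms of the so-called Stirling numbers of the second kind, which may be denoted by $\mathfrak{S}_k^{(r)}$, and are then defined by the relation

$$\frac{\left(e^{hD} - 1\right)^r}{r!} = \sum_{k=r}^{\infty}\frac{\mathfrak{S}_k^{(r)}}{k!}(hD)^k, \tag{5.3.27}$$

so that (5.3.26) can be written in the form

$$\Delta^r p_0 = \left[1 + \frac{\mathfrak{S}_{r+1}^{(r)}}{r+1}(hD) + \frac{\mathfrak{S}_{r+2}^{(r)}}{(r+1)(r+2)}(hD)^2 + \cdots\right]h^rD^r p_0. \tag{5.3.28}$$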
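
In the same spirit as before, and again only as an illustrative sketch, the coefficients of (5.3.28) follow from the standard recurrence for Stirling numbers of the second kind, $S(k, r) = r\,S(k-1, r) + S(k-1, r-1)$, and can be compared with the explicit coefficients of (5.3.26). The function names are hypothetical.

```python
from fractions import Fraction

def stirling_second(k, r):
    """Stirling number of the second kind S(k, r), by the standard recurrence."""
    if k == r:
        return 1
    if r == 0 or r > k:
        return 0
    return r * stirling_second(k - 1, r) + stirling_second(k - 1, r - 1)

def bracket_coefficients(r, terms):
    """Coefficients of (hD)^0, (hD)^1, ... inside the brackets of (5.3.28)."""
    coeffs, denom = [], 1
    for j in range(terms):
        if j > 0:
            denom *= r + j                 # builds (r+1)(r+2)...(r+j)
        coeffs.append(Fraction(stirling_second(r + j, r), denom))
    return coeffs
```

All of these coefficients are positive, in contrast with the alternating signs found in (5.3.7).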
By comparing the relation (5.8.28) 
(—vy =? — 
with (5.3.25), we see that a corresponding formula for backward differ- 
ences can be obtained by replacing A by —V and D by —D in (5.3.26) 


or (5.3.28). 
Similar formulas involving central differences are readily obtained from 


the relations 


(5.3.29) 


§ = 2 sinh », p = cosh ”, yd = sinh hD. (5.3.80) 


Thus, for example, there follows 


ubpo = [(AD) + @(hD)* + rig(AD)§ + >: ‘Ip0 (5.8.81) 
8p) = [(hD)? + ya(hD)* + win(hD)* + °° [po (5.3.32) 
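The link between (5.3.26) and the Stirling numbers of the second kind can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original text: the function names `stirling2` and `coeff` are illustrative, and the standard recurrence S(n, k) = k S(n-1, k) + S(n-1, k-1) is assumed for the Stirling numbers. By (5.3.27), the coefficient of $(hD)^{r+j}$ in $\Delta^r$ is $r!\,\mathcal{S}_{r+j}^{(r)}/(r+j)!$.

```python
from fractions import Fraction
from math import factorial


def stirling2(n, k):
    """Stirling number of the second kind, by the usual two-term recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def coeff(r, j):
    """Coefficient of (hD)^(r+j) in the expansion of Delta^r, from (5.3.27)."""
    return Fraction(factorial(r) * stirling2(r + j, r), factorial(r + j))


# Compare with the closed forms displayed in (5.3.26) for several orders r.
for r in range(1, 7):
    assert coeff(r, 0) == 1
    assert coeff(r, 1) == Fraction(r, 2)
    assert coeff(r, 2) == Fraction(r * (3 * r + 1), 24)
    assert coeff(r, 3) == Fraction(r * r * (r + 1), 48)
```

For instance, `coeff(2, 2)` evaluates to 7/12, which is the coefficient of $(hD)^4$ in $\Delta^2$.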


5.4. Newtonian Integration Formulas. For the purpose of obtaining
formulas for numerical integration, we may make use of (5.2.10), in the form

$$J = \Delta D^{-1}, \tag{5.4.1}$$

combined with one of the relations of (5.3.3).
Thus, to obtain a formula involving forward differences for the
approximation of the integral

$$\int_{x_0}^{x_r} f(x)\,dx,$$

we may notice first that, when f(x) = p(x) is a polynomial, this integral
is given by

$$(1 + E + E^2 + \cdots + E^{r-1})Jp_0 = \frac{E^r - 1}{E - 1}\,Jp_0.$$

Hence, by expressing E and D in terms of $\Delta$, there follows symbolically

$$\int_{x_0}^{x_0+rh} p(x)\,dx = h\left[\frac{(1+\Delta)^r - 1}{\Delta}\right]\left[\frac{\Delta}{\log(1+\Delta)}\right]p_0. \tag{5.4.2}$$


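The expansion of the first operator is easily found to be

$$\frac{(1+\Delta)^r - 1}{\Delta} = \sum_{i=0}^{\infty}\binom{r}{i+1}\Delta^i, \tag{5.4.3}$$

whereas the expansion of the second factor may be written in the form

$$\frac{\Delta}{\log(1+\Delta)} = \sum_{j=0}^{\infty} c_j\Delta^j, \tag{5.4.4}$$

the first nine coefficients of which are

$$c_0 = 1, \quad c_1 = \tfrac{1}{2}, \quad c_2 = -\tfrac{1}{12}, \quad c_3 = \tfrac{1}{24}, \quad c_4 = -\tfrac{19}{720}, \quad c_5 = \tfrac{3}{160},$$
$$c_6 = -\tfrac{863}{60480}, \quad c_7 = \tfrac{275}{24192}, \quad c_8 = -\tfrac{33953}{3628800}. \tag{5.4.5}$$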
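The coefficients in (5.4.5) can be regenerated from the requirement that the series (5.4.4), multiplied by the series for $\log(1+\Delta)/\Delta = 1 - \Delta/2 + \Delta^2/3 - \cdots$, must reduce to unity. The sketch below is illustrative only (the function name `gregory_coefficients` is not from the text) and uses exact fractions.

```python
from fractions import Fraction


def gregory_coefficients(n):
    """Return c_0 .. c_{n-1}, where Delta/log(1+Delta) = sum of c_j Delta^j."""
    c = []
    for j in range(n):
        # log(1+D)/D has coefficients (-1)^i/(i+1); the product with the c-series
        # must equal 1, which fixes c_j in terms of c_0 .. c_{j-1}.
        s = sum(c[i] * Fraction((-1) ** (j - i), j - i + 1) for i in range(j))
        c.append((Fraction(1) if j == 0 else Fraction(0)) - s)
    return c


c = gregory_coefficients(9)
print(c[4], c[8])  # the text lists -19/720 and -33953/3628800
```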


Hence the operator involved in (5.4.2) can be expressed in the form

$$\sum_{k=0}^{\infty}\sum_{i=0}^{k} c_{k-i}\binom{r}{i+1}\Delta^k. \tag{5.4.6}$$

Thus, if we write

$$\alpha_k^{(r)} = \sum_{i=0}^{k} c_{k-i}\binom{r}{i+1} = c_k\binom{r}{1} + c_{k-1}\binom{r}{2} + c_{k-2}\binom{r}{3} + \cdots, \tag{5.4.7}$$

where the series terminates when the subscript of c vanishes or when the
arguments of the binomial coefficient become equal, the required formula
becomes

$$\int_{x_0}^{x_0+rh} p(x)\,dx = h\left(\sum_{k=0}^{\infty}\alpha_k^{(r)}\Delta^k\right)p_0. \tag{5.4.8}$$


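In particular, in the case r = 1, there follows $\alpha_k^{(1)} = c_k$ and (5.4.8)
becomes

$$\int_{x_0}^{x_0+h} p(x)\,dx = h\left(1 + \tfrac{1}{2}\Delta - \tfrac{1}{12}\Delta^2 + \tfrac{1}{24}\Delta^3 - \tfrac{19}{720}\Delta^4 + \tfrac{3}{160}\Delta^5 - \cdots\right)p_0. \tag{5.4.9}$$

In the case r = 2, there follows $\alpha_k^{(2)} = 2c_k + c_{k-1}$, and we may obtain the
formula

$$\int_{x_0}^{x_0+2h} p(x)\,dx = 2h\left(1 + \Delta + \tfrac{1}{6}\Delta^2 + 0\,\Delta^3 - \tfrac{1}{180}\Delta^4 + \tfrac{1}{180}\Delta^5 - \cdots\right)p_0. \tag{5.4.10}$$

Further, in the case r = -1, there follows

$$\alpha_k^{(-1)} = -c_k + c_{k-1} - c_{k-2} + \cdots + (-1)^{k+1}c_0,$$

and (5.4.8) becomes

$$\int_{x_0-h}^{x_0} p(x)\,dx = h\left(1 - \tfrac{1}{2}\Delta + \tfrac{5}{12}\Delta^2 - \tfrac{3}{8}\Delta^3 + \tfrac{251}{720}\Delta^4 - \tfrac{95}{288}\Delta^5 + \cdots\right)p_0. \tag{5.4.11}$$

This formula amounts to the result of using the Newton forward-difference
formula to extrapolate f(x) = p(x) backward over the interval $(x_0 - h, x_0)$ and
integrating the result over that interval.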
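The three special cases above all follow from (5.4.7) once the $c_j$ are known. As a hedged illustration (the names `gregory_coefficients`, `binom`, and `alpha` are assumptions of this sketch, not notation from the text), the coefficients $\alpha_k^{(r)}$ can be formed exactly for any integer r, using the generalized binomial coefficient so that negative r is admitted.

```python
from fractions import Fraction


def gregory_coefficients(n):
    """c_0 .. c_{n-1} of Delta/log(1+Delta), as in (5.4.4)."""
    c = []
    for j in range(n):
        s = sum(c[i] * Fraction((-1) ** (j - i), j - i + 1) for i in range(j))
        c.append((Fraction(1) if j == 0 else Fraction(0)) - s)
    return c


def binom(r, k):
    """Generalized binomial coefficient r(r-1)...(r-k+1)/k! for any integer r."""
    out = Fraction(1)
    for i in range(k):
        out = out * (r - i) / (i + 1)
    return out


def alpha(r, n):
    """alpha_0^(r) .. alpha_{n-1}^(r) as defined by (5.4.7)."""
    c = gregory_coefficients(n)
    return [sum(c[k - i] * binom(r, i + 1) for i in range(k + 1)) for k in range(n)]


print(alpha(2, 6))   # twice the bracketed coefficients of (5.4.10)
print(alpha(-1, 6))  # negatives of the coefficients in (5.4.11)
```

The sign reversal for r = -1 arises because (5.4.8) then integrates from $x_0$ to $x_0 - h$, whereas (5.4.11) is written with the limits in increasing order.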


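Similar formulas are easily obtained in terms of backward differences, for
the purpose of integrating a function over r intervals terminating at the
end of the tabulation. It may be seen that the formula for integrating
from $x_n - rh$ to $x_n$ can be obtained from (5.4.8) by replacing $\Delta$ by $-\nabla$ and
$p_0$ by $p_n$. Thus, for example, one has

$$\int_{x_n-h}^{x_n} p(x)\,dx = h\left(1 - \tfrac{1}{2}\nabla - \tfrac{1}{12}\nabla^2 - \tfrac{1}{24}\nabla^3 - \tfrac{19}{720}\nabla^4 - \tfrac{3}{160}\nabla^5 - \cdots\right)p_n \tag{5.4.12}$$

and

$$\int_{x_n}^{x_n+h} p(x)\,dx = h\left(1 + \tfrac{1}{2}\nabla + \tfrac{5}{12}\nabla^2 + \tfrac{3}{8}\nabla^3 + \tfrac{251}{720}\nabla^4 + \tfrac{95}{288}\nabla^5 + \cdots\right)p_n, \tag{5.4.13}$$

the last formula being useful for integration over an interval beyond the
range of tabulation and playing an important role in the numerical
solution of differential equations.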
In each case, the error term to be introduced when p(x) is replaced by
f(x), and the series is truncated with the nth difference, can be obtained
in the form given by (3.3.5). Thus, in the case of (5.4.8), the error is

$$E_n = \frac{1}{(n+1)!}\int_{x_0}^{x_r}(x - x_0)(x - x_1)\cdots(x - x_n)\,f^{(n+1)}(\xi)\,dx, \tag{5.4.14}$$

where $\xi$ depends upon x and lies between $x_0$ and the larger of $x_r$ and $x_n$,
and an analogous term applies to the formula with backward differences.

When r = 1, or when r is a negative integer, the coefficient of $f^{(n+1)}$ in
(5.4.14), which has been denoted by $\pi(x)$, does not change sign in the
interval of integration, and the second law of the mean can then be applied
to give the more useful form

$$E_n = \frac{f^{(n+1)}(\eta)}{(n+1)!}\int_{x_0}^{x_r}\pi(x)\,dx \qquad (r \le 1). \tag{5.4.15}$$


A reference to the form of Newton's interpolation formula, from which
the preceding formulas may be obtained by integration, shows that,
when (5.4.15) applies, the error term is obtained by replacing $\Delta^{n+1}p_0$ or $\nabla^{n+1}p_n$
by $h^{n+1}f^{(n+1)}(\eta)$ in the first nonvanishing term omitted. Thus, for example, we
may deduce from (5.4.12) that

$$\int_{x_n-h}^{x_n} f(x)\,dx = h\left(1 - \tfrac{1}{2}\nabla - \tfrac{1}{12}\nabla^2\right)f_n - \frac{h^4}{24}f'''(\eta),$$

where $x_n - 2h < \eta < x_n$.


In those cases when n = r, so that the number of differences retained
is equal to the number of h intervals in the range of integration, the
formulas reduce to Newton-Cotes formulas when expressed in terms of the
ordinates, and the error terms can be supplied by reference to the results
of §3.5. Thus, for example, we may deduce from (5.4.10) that

$$\int_{x_0}^{x_0+2h} f(x)\,dx = 2h\left(1 + \Delta + \tfrac{1}{6}\Delta^2\right)f_0 - \frac{h^5}{90}f^{\mathrm{iv}}(\eta),$$

where $x_0 < \eta < x_0 + 2h$, and the formula is equivalent to Simpson's
rule.
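The equivalence with Simpson's rule is easily seen numerically. In the following sketch (the helper names are illustrative, not from the text) the truncated form of (5.4.10) is evaluated from the leading forward differences and compared with Simpson's rule written in terms of the ordinates.

```python
def forward_differences(values):
    """Return [f0, Δf0, Δ²f0, ...] for equally spaced ordinates."""
    diffs, row = [values[0]], list(values)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[0])
    return diffs


def newton_two_intervals(values, h):
    """2h(1 + Δ + Δ²/6) f0, i.e. (5.4.10) truncated after second differences."""
    f0, d1, d2 = forward_differences(values[:3])
    return 2 * h * (f0 + d1 + d2 / 6)


def simpson(values, h):
    """Simpson's rule over two intervals, in terms of the ordinates."""
    return h / 3 * (values[0] + 4 * values[1] + values[2])


ordinates = [x ** 3 for x in (0.0, 1.0, 2.0)]  # f(x) = x^3 on [0, 2], h = 1
print(newton_two_intervals(ordinates, 1.0), simpson(ordinates, 1.0))
```

Both expressions reproduce the integral of a cubic exactly, as the error term $-h^5 f^{\mathrm{iv}}(\eta)/90$ requires.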


If the terms involving $\Delta$ and $\Delta^2$ in (5.4.10) are expressed explicitly in
terms of the ordinates $p_0$, $p_1$, and $p_2$, the result takes the form

$$\int_{x_0}^{x_0+2h} p(x)\,dx = \frac{h}{3}(p_0 + 4p_1 + p_2) - \frac{h}{90}\left(\Delta^4 - \Delta^5 + \tfrac{37}{42}\Delta^6 - \cdots\right)p_0, \tag{5.4.16}$$

which may be considered as Simpson's rule with "correction terms"
expressed in terms of forward differences, for use at the beginning of a
tabulation. The corresponding formula with backward differences is

$$\int_{x_n-2h}^{x_n} p(x)\,dx = \frac{h}{3}(p_n + 4p_{n-1} + p_{n-2}) - \frac{h}{90}\left(\nabla^4 + \nabla^5 + \tfrac{37}{42}\nabla^6 + \cdots\right)p_n. \tag{5.4.17}$$


5.5. Newtonian Formulas for Repeated Integration. It frequently
happens that the second derivative of a function F(x) is known,

$$F''(x) = f(x), \tag{5.5.1}$$

and that F(x), and perhaps also F'(x), are required at a set of equally
spaced points $x_0, x_1, \ldots, x_n$ with the values $F(x_0) = F_0$ and $F'(x_0) = F_0'$
prescribed in advance. In order to treat this problem operationally,
without being concerned with remainder terms, we again imagine that
F and f are replaced by polynomials and denote this fact by writing P
and p for F and f, respectively.

If P''(x) is tabulated at the points $x_0, \ldots, x_n$, it is clear that use may
be made, say, of (5.4.9), written in the form

$$P'_{k+1} = P'_k + h\left(1 + \tfrac{1}{2}\Delta - \tfrac{1}{12}\Delta^2 + \tfrac{1}{24}\Delta^3 - \tfrac{19}{720}\Delta^4 + \tfrac{3}{160}\Delta^5 - \cdots\right)P''_k, \tag{5.5.2}$$

where h is the spacing, to obtain a corresponding tabulation of P'(x),
after which the same formula may be used again, in the form

$$P_{k+1} = P_k + h\left(1 + \tfrac{1}{2}\Delta - \tfrac{1}{12}\Delta^2 + \tfrac{1}{24}\Delta^3 - \tfrac{19}{720}\Delta^4 + \tfrac{3}{160}\Delta^5 - \cdots\right)P'_k, \tag{5.5.3}$$

to determine the desired tabulation of P(x). Clearly, the formula
(5.4.12) could be used instead and would be needed near the end of the
tabulation if the value of $P''_k$ were not available for k > n.

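This procedure involves the formation of difference arrays relative to
both P'(x) and P''(x). In order to avoid the necessity of two such
arrays, we may transform (5.5.1) into the form

$$P_{k+1} = P_k + hP'_k + \int_{x_k}^{x_{k+1}}\!\int_{x_k}^{x} p(t)\,dt\,dx \tag{5.5.4}$$

and seek an operator $\Theta$ such that

$$\int_{x_k}^{x_{k+1}}\!\int_{x_k}^{x} p(t)\,dt\,dx = \Theta p_k.$$

Thus $\Theta$ must be such that

$$(E - 1 - hD)P_k = \Theta p_k = \Theta D^2P_k \tag{5.5.5}$$

and hence

$$\Theta = \frac{E - 1 - hD}{D^2} = h^2\,\frac{E - 1 - \log E}{(\log E)^2}. \tag{5.5.6}$$

In terms of the operator $\Delta$, there then follows

$$\Theta = h^2\,\frac{\Delta - \log(1+\Delta)}{[\log(1+\Delta)]^2} = h^2\left[\frac{\Delta - \log(1+\Delta)}{\Delta^2}\right]\left[\frac{\Delta}{\log(1+\Delta)}\right]^2$$
$$= h^2\left(\tfrac{1}{2} - \tfrac{1}{3}\Delta + \tfrac{1}{4}\Delta^2 - \cdots\right)\left(1 + \tfrac{1}{2}\Delta - \tfrac{1}{12}\Delta^2 + \cdots\right)^2$$
$$= h^2\left(\tfrac{1}{2} + \tfrac{1}{6}\Delta - \tfrac{1}{24}\Delta^2 + \tfrac{1}{45}\Delta^3 - \tfrac{7}{480}\Delta^4 + \tfrac{107}{10080}\Delta^5 - \cdots\right), \tag{5.5.7}$$

so that (5.5.4) takes the form

$$P_{k+1} = P_k + hP'_k + h^2\left(\tfrac{1}{2} + \tfrac{1}{6}\Delta - \tfrac{1}{24}\Delta^2 + \tfrac{1}{45}\Delta^3 - \tfrac{7}{480}\Delta^4 + \tfrac{107}{10080}\Delta^5 - \cdots\right)P''_k. \tag{5.5.8}$$

Thus, if (5.5.2) and (5.5.8) are used for the calculation of P'(x)
and P(x), only the differences of P''(x) are required.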
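In a similar way, the formula

$$P_{k+1} = P_k + hP'_{k+1} - h^2\left(\tfrac{1}{2} - \tfrac{1}{6}\nabla - \tfrac{1}{24}\nabla^2 - \tfrac{1}{45}\nabla^3 - \tfrac{7}{480}\nabla^4 - \tfrac{107}{10080}\nabla^5 - \cdots\right)P''_{k+1} \tag{5.5.9}$$

can be derived for use near the end of a tabulation, in conjunction
with (5.4.12), written in the form

$$P'_{k+1} = P'_k + h\left(1 - \tfrac{1}{2}\nabla - \tfrac{1}{12}\nabla^2 - \tfrac{1}{24}\nabla^3 - \tfrac{19}{720}\nabla^4 - \tfrac{3}{160}\nabla^5 - \cdots\right)P''_{k+1}. \tag{5.4.12'}$$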


In those cases when values of P' are not required, we may obtain a
more useful formula by noticing that

$$\nabla^2D^{-2}P''(x) = \nabla^2P(x), \tag{5.5.10}$$

where the factor $\nabla^2$ is inserted to annihilate the arbitrary linear function
of x which would correspond to the (improper) inverse operator $D^{-2}$.
Thus there follows

$$\nabla^2P_k = h^2\left[\frac{\nabla}{-\log(1-\nabla)}\right]^2P''_k = h^2\left(1 - \tfrac{1}{2}\nabla - \tfrac{1}{12}\nabla^2 - \cdots\right)^2P''_k$$

or

$$\nabla^2P_k = h^2\left(1 - \nabla + \tfrac{1}{12}\nabla^2 + 0\,\nabla^3 - \tfrac{1}{240}\nabla^4 - \tfrac{1}{240}\nabla^5 - \cdots\right)P''_k. \tag{5.5.11}$$

Thus, since $\nabla^2P_k = P_k - 2P_{k-1} + P_{k-2}$, this formula permits the
determination of $P_k$ from the two preceding values of P. It should be noticed
that the determination of $P_k$ makes use of $P''_k$.

Another formula, in which only preceding values of P'' are involved, is
obtained by operating on both sides of (5.5.11) by E, and replacing E by
$(1 - \nabla)^{-1}$ in the right-hand member, to give

$$\nabla^2P_{k+1} = h^2(1-\nabla)^{-1}\left[\frac{\nabla}{\log(1-\nabla)}\right]^2P''_k$$
$$= h^2\left(1 + 0\,\nabla + \tfrac{1}{12}\nabla^2 + \tfrac{1}{12}\nabla^3 + \tfrac{19}{240}\nabla^4 + \tfrac{3}{40}\nabla^5 + \cdots\right)P''_k. \tag{5.5.12}$$

In fact, a whole series of formulas of either type can be obtained,
for example, by operating on both members of (5.5.11) or (5.5.12) by
$1 + a_1\nabla + a_2\nabla^2 + \cdots$, where the a's are arbitrary constants.
Such formulas are particularly useful in the numerical solution of
differential equations (see §6.12), which include (5.5.1) as a very special case.

In order to illustrate the use of these formulas in connection with
(5.5.1), we consider a simple example. It is supposed that the values of
F'' listed in the table are known and that the values F(1) = 0 and
F'(1) = 1 are prescribed. (The differences are expressed in units of the
third decimal place.)

     x       F         F'        F''     ΔF''   Δ²F''   Δ³F''

    1.0    0         1.000     1.000
                                         331
    1.1    0.1055    1.1160    1.331             66
                                         397               6
    1.2    0.2244              1.728             72
                                         469               6
    1.3    0.3606              2.197             78
                                         547               6
    1.4                        2.744             84
                                         631
    1.5                        3.375


In order to determine $F_1 = F(1.1)$ and $F_1' = F'(1.1)$, we use the
approximate relations resulting from replacing P by F in (5.5.8) and (5.5.2):

$$F_1 \approx 0 + (0.1)(1) + 0.01\left[\tfrac{1}{2}(1.000) + \tfrac{1}{6}(0.331) - \tfrac{1}{24}(0.066) + \tfrac{1}{45}(0.006)\right] = 0.1055,$$
$$F_1' \approx 1 + 0.1\left[1.000 + \tfrac{1}{2}(0.331) - \tfrac{1}{12}(0.066) + \tfrac{1}{24}(0.006)\right] = 1.1160.$$

Formula (5.5.8) is then used again to determine $F_2$. For the evaluation
of $F_3$, sufficiently many backward differences are available for the use of
(5.5.9) or (5.5.11). Hence, unless values of F' are required, $F_2'$ need not
be calculated, and $F_3$ may be determined by (5.5.11):

$$F_3 \approx 2F_2 - F_1 + h^2\left(F_3'' - \nabla F_3'' + \tfrac{1}{12}\nabla^2F_3''\right)$$
$$= 0.3433 + 0.01\left[2.197 - 0.469 + \tfrac{1}{12}(0.072)\right] = 0.3606.$$

From this stage onward, use may be made exclusively of (5.5.11).

In this example, the given data are exact values of F'' corresponding
to $F''(x) = x^3$, from which there follows $F(x) = 0.05x^5 + 0.75x - 0.8$,
and the results are correct to the places given. Since here the third
difference of F''(x) is constant, exact values would have been obtained if no
intermediate round-offs had been effected. A check on the calculation,
which would be useful if the last difference retained were not constant,
would be afforded by the use of (5.5.12).
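The worked example can be retraced mechanically. The sketch below is an illustration under stated assumptions rather than the procedure as printed: it uses floating-point ordinates $F'' = x^3$ without the intermediate rounding of the hand computation, and the function names are invented for the purpose. It applies (5.5.8) and (5.5.2) truncated after third differences, then (5.5.11) truncated after second differences.

```python
h = 0.1
x = [1.0 + h * k for k in range(6)]
f2 = [xi ** 3 for xi in x]  # tabulated values of F''(x) = x^3


def forward_diff(vals, k, order):
    """Forward difference of the given order at index k."""
    row = list(vals)
    for _ in range(order):
        row = [b - a for a, b in zip(row, row[1:])]
    return row[k]


def backward_diff(vals, k, order):
    """Backward difference at index k (the forward difference at k - order)."""
    return forward_diff(vals, k - order, order)


def step_558(Fk, Fpk, k):
    """Advance F by (5.5.8), truncated after third differences of F''."""
    d = [forward_diff(f2, k, j) for j in range(4)]
    return Fk + h * Fpk + h ** 2 * (d[0] / 2 + d[1] / 6 - d[2] / 24 + d[3] / 45)


def step_552(Fpk, k):
    """Advance F' by (5.5.2), truncated after third differences of F''."""
    d = [forward_diff(f2, k, j) for j in range(4)]
    return Fpk + h * (d[0] + d[1] / 2 - d[2] / 12 + d[3] / 24)


def step_5511(F_prev, F_prev2, k):
    """Determine F_k by (5.5.11), truncated after second backward differences."""
    b = [backward_diff(f2, k, j) for j in range(3)]
    return 2 * F_prev - F_prev2 + h ** 2 * (b[0] - b[1] + b[2] / 12)


F1, Fp1 = step_558(0.0, 1.0, 0), step_552(1.0, 0)
F2 = step_558(F1, Fp1, 1)
F3 = step_5511(F2, F1, 3)
print(round(F1, 4), round(Fp1, 4), round(F2, 4), round(F3, 4))
```

Because the third difference of F'' is constant, the computed values agree with $F(x) = 0.05x^5 + 0.75x - 0.8$ to within floating-point rounding.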

5.6. Central-difference Integration Formulas. The most useful
integration formulas involving central differences are those in which the
differences are evaluated at the center of the range of integration, and the
integral is expressed in the form

$$\int_{x_0-mh}^{x_0+mh} f(x)\,dx.$$

In terms of the operator J defined in (5.2.5), this integral is given by

$$(E^{-m} + E^{-m+1} + \cdots + E^{m-1})Jp_0 = \frac{E^m - E^{-m}}{E - 1}\,Jp_0 = \frac{e^{mhD} - e^{-mhD}}{D}\,p_0$$

when f(x) = p(x) is a polynomial, and hence we may write

$$\int_{x_0-mh}^{x_0+mh} p(x)\,dx = \frac{2\sinh mhD}{D}\,p_0. \tag{5.6.1}$$


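In order to obtain an expansion in central differences, we first obtain

$$\frac{2\sinh mhD}{D} = 2mh\left[1 + \frac{m^2(hD)^2}{6} + \frac{m^4(hD)^4}{120} + \frac{m^6(hD)^6}{5040} + \cdots\right]$$

and then replace hD by its expansion given in (5.3.12), to give

$$\frac{2\sinh mhD}{D} = 2mh\left[1 + \frac{m^2\delta^2}{6}\left(1 - \tfrac{1}{24}\delta^2 + \tfrac{3}{640}\delta^4 - \cdots\right)^2 + \frac{m^4\delta^4}{120}\left(1 - \tfrac{1}{24}\delta^2 + \cdots\right)^4 + \frac{m^6\delta^6}{5040}(1 - \cdots)^6 + \cdots\right].$$

Hence there follows

$$\int_{x_0-mh}^{x_0+mh} p(x)\,dx = 2mh\left[1 + \frac{m^2}{6}\delta^2 + \frac{m^2(3m^2 - 5)}{360}\delta^4 + \frac{m^2(3m^4 - 21m^2 + 28)}{15120}\delta^6 + \cdots\right]p_0. \tag{5.6.2}$$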
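In the special cases m = 1 and m = 2, the relevant formulas are of the
form

$$\int_{x_0-h}^{x_0+h} p(x)\,dx = 2h\left(1 + \tfrac{1}{6}\delta^2 - \tfrac{1}{180}\delta^4 + \tfrac{1}{1512}\delta^6 - \tfrac{23}{226800}\delta^8 + \cdots\right)p_0 \tag{5.6.3}$$

and

$$\int_{x_0-2h}^{x_0+2h} p(x)\,dx = 4h\left(1 + \tfrac{2}{3}\delta^2 + \tfrac{7}{90}\delta^4 - \tfrac{2}{945}\delta^6 + \tfrac{13}{56700}\delta^8 - \cdots\right)p_0. \tag{5.6.4}$$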
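Since (5.6.3) and (5.6.4) are exact for polynomials when all nonvanishing differences are retained, the listed coefficients can be tested on a polynomial of degree eight. The sketch below is illustrative only (names such as `central_diff` are assumptions), takes $x_0 = 0$ and $h = 1$, and works in exact fractions.

```python
from fractions import Fraction
from math import comb


def central_diff(p, order):
    """Even-order central difference of p at x = 0 with unit spacing."""
    half = order // 2
    return sum((-1) ** i * comb(order, i) * p(half - i) for i in range(order + 1))


# Coefficients of 1, δ², δ⁴, δ⁶, δ⁸ inside the parentheses of (5.6.3) and (5.6.4).
C_563 = [Fraction(1), Fraction(1, 6), Fraction(-1, 180), Fraction(1, 1512), Fraction(-23, 226800)]
C_564 = [Fraction(1), Fraction(2, 3), Fraction(7, 90), Fraction(-2, 945), Fraction(13, 56700)]


def integrate_central(p, m, coeffs):
    """Right-hand member of (5.6.3) (m = 1) or (5.6.4) (m = 2) with h = 1."""
    return 2 * m * sum(c * central_diff(p, 2 * j) for j, c in enumerate(coeffs))


p = lambda t: t ** 8
print(integrate_central(p, 1, C_563))  # integral of x^8 over (-1, 1) is 2/9
print(integrate_central(p, 2, C_564))  # integral of x^8 over (-2, 2) is 1024/9
```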


Formula (5.6.3) can also be expressed in the form

$$\int_{x_0-h}^{x_0+h} p(x)\,dx = \frac{h}{3}(p_{-1} + 4p_0 + p_1) - \frac{h}{90}\left(\delta^4 - \tfrac{5}{42}\delta^6 + \tfrac{23}{1260}\delta^8 - \cdots\right)p_0, \tag{5.6.5}$$

and so considered as Simpson's rule with "correction terms" expressed in
terms of central differences.

It is known (see Steffensen [18]) that, if p(x) is replaced by f(x) in
(5.6.2) and the formula is truncated with the difference of order 2k, then
the error term to be introduced can be expressed in the convenient form

$$E_k = \frac{f^{(2k+2)}(\eta)}{(2k+2)!}\int_{x_0-mh}^{x_0+mh}(x - x_0)^2\left[(x - x_0)^2 - h^2\right]\cdots\left[(x - x_0)^2 - k^2h^2\right]dx, \tag{5.6.6}$$

where $x_0 - mh < \eta < x_0 + mh$ if $k \le m$ and $x_0 - kh < \eta < x_0 + kh$ if
$k \ge m$. Reference to the Stirling interpolation formula, from which the
preceding formulas may be obtained by integration, shows that this error
term is obtained by replacing $\delta^{2k+2}p_0$ by $h^{2k+2}f^{(2k+2)}(\eta)$ in the first
nonvanishing term omitted.


An important formula, relating to repeated integration, is obtained
by noticing that, since

$$\delta^2D^{-2}P''(x) = \delta^2P(x),$$

and since we have the expansion

$$\frac{\delta^2}{h^2D^2} = \frac{\delta^2}{\delta^2 - \tfrac{1}{12}\delta^4 + \tfrac{1}{90}\delta^6 - \tfrac{1}{560}\delta^8 + \cdots} = 1 + \tfrac{1}{12}\delta^2 - \tfrac{1}{240}\delta^4 + \tfrac{31}{60480}\delta^6 - \cdots$$

from (5.3.12) and (5.3.18), there follows

$$\delta^2P_k = h^2\left(1 + \tfrac{1}{12}\delta^2 - \tfrac{1}{240}\delta^4 + \tfrac{31}{60480}\delta^6 - \cdots\right)P''_k. \tag{5.6.7}$$

Because of the fact that only differences of even order are involved,
this formula is usually preferable to (5.5.11) for advancing a step-by-step
double integration of a given tabulated function, over the portion of the
range in which the requisite central differences are available.

The formula (5.6.7) also will be used in the numerical solution of
boundary-value problems governed by certain second-order differential
equations (§6.17), whereas the analogous formulas (5.5.11) and (5.5.12)
are to be used for corresponding initial-value problems (§6.12).

5.7. Subtabulation. In many situations it is desirable to determine,
from a given difference table based on the spacing h, a new set of
differences based on a new spacing $\rho h$. This problem would occur, for
example, if a function were initially tabulated for increments of 0.1 in x and it
were required to subtabulate the function for increments of 0.01, in which
case $\rho = 0.1$. Whereas this subtabulation clearly could be effected by
the use of an appropriate interpolation formula, it is often more
convenient to form certain new differences, based on the new spacing, and
to build up the new tabulation from them. A similar problem arises in
a step-by-step numerical solution of a differential equation, in which
case a halving of the interval is desirable when the rate of change of the
solution being determined increases.


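In order to obtain formulas for such purposes, we designate the shifting
operator relative to the new spacing $\rho h$ by $E_1$. Since $E_1$ may be regarded
as effecting an h shift $\rho$ times, there follows

$$E_1 = E^{\rho}.$$

If we designate the forward-difference operator relative to $\rho h$ by $\Delta_1$, there
then follows

$$1 + \Delta_1 = (1 + \Delta)^{\rho}, \tag{5.7.1}$$

and hence we obtain the desired transformation in the symbolic form

$$\Delta_1^r = [(1 + \Delta)^{\rho} - 1]^r = \left[\rho\Delta + \frac{\rho(\rho - 1)}{2!}\Delta^2 + \frac{\rho(\rho - 1)(\rho - 2)}{3!}\Delta^3 + \cdots\right]^r. \tag{5.7.2}$$

The leading terms in this expansion can be obtained in the form

$$\Delta_1^r = \rho^r\Big\{\Delta^r + \frac{r(\rho - 1)}{2}\Delta^{r+1} + \frac{r(\rho - 1)}{24}\left[4(\rho - 2) + 3(r - 1)(\rho - 1)\right]\Delta^{r+2}$$
$$+ \frac{r(\rho - 1)}{48}\left[2(\rho - 2)(\rho - 3) + 4(r - 1)(\rho - 1)(\rho - 2) + (r - 1)(r - 2)(\rho - 1)^2\right]\Delta^{r+3} + \cdots\Big\}. \tag{5.7.3}$$

In particular, in the important case $\rho = \tfrac{1}{2}$, so that the spacing is halved,
there follows

$$\Delta_1^r = \left(\tfrac{1}{2}\Delta - \tfrac{1}{8}\Delta^2 + \tfrac{1}{16}\Delta^3 - \tfrac{5}{128}\Delta^4 + \cdots\right)^r$$
$$= 2^{-r}\left[\Delta^r - \frac{r}{4}\Delta^{r+1} + \frac{r(r + 3)}{32}\Delta^{r+2} - \frac{r(r + 4)(r + 5)}{384}\Delta^{r+3} + \cdots\right], \tag{5.7.4}$$

whereas the formula reduces in the case $\rho = \tfrac{1}{10}$ to

$$\Delta_1^r = 10^{-r}\left[\Delta^r - \frac{9r}{20}\Delta^{r+1} + \frac{3r(27r + 49)}{800}\Delta^{r+2} - \frac{3r(81r^2 + 441r + 580)}{16000}\Delta^{r+3} + \cdots\right]. \tag{5.7.5}$$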

In order to illustrate an appropriate technique, we again consider the
data tabulated in §4.8, where a difference table is constructed with
spacing h = 0.1, and suppose that the data are to be subtabulated by
tenths, that is, with a new spacing 0.01. Here, with $\rho = 0.1$, Eq. (5.7.5)
gives the formulas

$$\Delta_1 = 0.1\Delta - 0.045\Delta^2 + 0.0285\Delta^3 - 0.0206625\Delta^4 + \cdots,$$
$$\Delta_1^2 = 0.01\Delta^2 - 0.009\Delta^3 + 0.007725\Delta^4 - \cdots,$$
$$\Delta_1^3 = 0.001\Delta^3 - 0.00135\Delta^4 + \cdots, \tag{5.7.6}$$
$$\Delta_1^4 = 0.0001\Delta^4 + \cdots,$$

through fourth differences, where the coefficients have been expressed
exactly, for convenient reference. In units of the fifth place, the new
forward differences relative to x = 1.0 and x = 1.1 are found as follows:

    x = 1.0:   Δ₁ = 536.2,   Δ₁² = -8.5,   Δ₁³ = -0.05,   Δ₁⁴ = 0.0008 ≈ 0.
    x = 1.1:   Δ₁ = 449.1,   Δ₁² = -8.9,   Δ₁³ = -0.05,   Δ₁⁴ = 0.001 ≈ 0.

Thus, we may suppose that the third differences are constant (with
the accuracy indicated) over the first range, and we may set up the
underlined entries in the following table:

      x        f          Δf       Δ²f      Δ³f

    1.00    0.84147
                         536.2
    1.01    0.846832               -8.5
                         527.7              -0.05
    1.02    0.852109               -8.6
                         519.1              -0.05
    1.03    0.857300               -8.6
                         510.5              -0.05
    1.04    0.862405               -8.6
                         501.9              -0.05
    1.05    0.867424               -8.7
                         493.2              -0.05







The remaining entries are then filled in by addition, proceeding from right
to left, and the results round, correctly to five places, to known rounded
values of f(x) = sin x.

Additional parts of units in the fifth place are retained in order to
reduce the danger of propagated effects of round-off errors. Since
the errors are propagated to the left, and since (see §4.9) an error of
magnitude $\epsilon$ in the rth difference could lead to errors of magnitude $2^r\epsilon$
in the calculated values of f, it follows that if no errors of one-half unit
are to be so introduced, the round-off errors in the rth differences should
be smaller than $2^{-r-1}$ units in that place. Hence, for this reason alone,
at least one extra place should be retained in the intermediate
subtabulation of f and in the first two differences, two extra places in each of the
next three differences, and so forth.

If backward differences are used, we see that (5.7.1) must be replaced by

$$1 - \nabla_1 = (1 - \nabla)^{\rho},$$

and hence all the formulas of this section are transformed to corresponding
formulas for backward differences by replacing $\Delta_1$ by $-\nabla_1$ and $\Delta$ by $-\nabla$.
Formulas using central differences may also be found in the literature
(see Prob. 16).
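The numerical coefficients in (5.7.6) follow directly from (5.7.2), and may be regenerated for any ratio $\rho$ by multiplying truncated power series. The sketch below is one possible arrangement, with the illustrative name `subtab_coefficients`; it uses exact fractions so that the results can be compared with the exactly stated decimals.

```python
from fractions import Fraction


def subtab_coefficients(rho, r, terms):
    """Coefficients of Δ^r, Δ^(r+1), ... in Δ1^r = [(1+Δ)^ρ - 1]^r, per (5.7.2)."""
    n = r + terms
    # Binomial series of (1+Δ)^ρ - 1, truncated below Δ^n.
    base = [Fraction(0)] * n
    b = Fraction(1)
    for k in range(1, n):
        b = b * (rho - (k - 1)) / k
        base[k] = b
    # Raise the series to the rth power by repeated truncated multiplication.
    power = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for _ in range(r):
        new = [Fraction(0)] * n
        for i, a in enumerate(power):
            if a:
                for j, c in enumerate(base):
                    if i + j < n:
                        new[i + j] += a * c
        power = new
    return power[r:r + terms]


rho = Fraction(1, 10)
for r in range(1, 5):
    print(r, [float(c) for c in subtab_coefficients(rho, r, 5 - r)])
```

With $\rho = \tfrac12$ the same routine yields the coefficients of (5.7.4).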
5.8. Summation. The Euler-Maclaurin Sum Formula. The problem
of evaluating a sum

$$\sum_{\nu=m}^{k-1} f(x_0 + \nu h) = f_m + f_{m+1} + \cdots + f_{k-1} \qquad (k > m),$$

where $f_\nu = f(x_0 + \nu h)$, is closely related to the problem of determining
a function F(x) such that

$$\Delta F(x) = f(x), \tag{5.8.1}$$

since, if any such function F(x) is known, there follows immediately

$$\sum_{\nu=m}^{k-1} f_\nu = (F_{m+1} - F_m) + (F_{m+2} - F_{m+1}) + \cdots + (F_{k-1} - F_{k-2}) + (F_k - F_{k-1})$$
$$= F_k - F_m \equiv [F_\nu]_m^k. \tag{5.8.2}$$

It should be noticed that the upper limit in the last term exceeds by unity
the upper limit in the original sum.

If we invert (5.8.1) in the symbolic form $F_k = \Delta^{-1}f_k$, it follows that we
may write

$$\Delta^{-1}f_k = C + \sum_{\nu=m}^{k-1} f_\nu, \qquad \sum_{\nu=M}^{k-1} f_\nu = \left[\Delta^{-1}f_\nu\right]_M^k, \tag{5.8.3}$$

where C is an arbitrary constant and m is an arbitrarily fixed integer such
that m < M < k. Thus we may refer to $\Delta^{-1}f_k$ as an indefinite sum of
$f_k$ and may correspondingly consider indefinite summation to be the
inverse of the process of differencing, just as indefinite integration is the
inverse of differentiation. As was noted previously, to any one inverse
$\Delta^{-1}f(x)$ we may add any function $\omega_h(x)$ which is of period h, since any such
function is annihilated by $\Delta$. However, if only values of x which differ
from some fixed value $x_0$ by integral multiples of h are involved, then,
for that set of values of x, the additive function $\omega_h(x)$ reduces to the
constant C, which itself disappears in definite summation between limits.

A simple formula for summing any polynomial p(x) is obtained by
writing

$$\sum_{k=0}^{r-1} p_k = (1 + E + E^2 + \cdots + E^{r-1})p_0 = \frac{E^r - 1}{E - 1}\,p_0 = \frac{(1 + \Delta)^r - 1}{\Delta}\,p_0$$
$$= \left[r + \frac{r(r - 1)}{2!}\Delta + \frac{r(r - 1)(r - 2)}{3!}\Delta^2 + \cdots\right]p_0. \tag{5.8.4}$$


Thus, for example, in order to sum the series $1^2 + 2^2 + \cdots + r^2$, we
may take $p(x) = (x + 1)^2$, $x_0 = 0$, and h = 1. With $p_0 = 1$, $\Delta p_0 = 3$,
$\Delta^2p_0 = 2$, $\Delta^3p_0 = \cdots = 0$, Eq. (5.8.4) gives

$$p_0 + p_1 + \cdots + p_{r-1} = r + 3\,\frac{r(r - 1)}{2!} + 2\,\frac{r(r - 1)(r - 2)}{3!} = \tfrac{1}{6}r(r + 1)(2r + 1).$$
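Formula (5.8.4) translates directly into a short routine: form the leading forward differences of the polynomial and weight them by binomial coefficients. The sketch below is illustrative (the names `forward_differences` and `sum_polynomial` are not from the text) and requires the caller to state the degree of p, so that only nonvanishing differences are formed.

```python
from fractions import Fraction


def forward_differences(values):
    """Return [p0, Δp0, Δ²p0, ...] for equally spaced ordinates."""
    diffs, row = [values[0]], list(values)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[0])
    return diffs


def sum_polynomial(p, r, degree, x0=0, h=1):
    """p(x0) + p(x0+h) + ... + p(x0+(r-1)h), evaluated by (5.8.4)."""
    diffs = forward_differences([p(x0 + k * h) for k in range(degree + 1)])
    total, binom = Fraction(0), Fraction(1)
    for i, d in enumerate(diffs):
        binom = binom * (r - i) / (i + 1)  # binomial coefficient C(r, i+1)
        total += binom * d
    return total


# Sum of the first ten squares, with p(x) = (x+1)^2 as in the text.
print(sum_polynomial(lambda t: (t + 1) ** 2, 10, 2))  # r(r+1)(2r+1)/6 at r = 10
```

Only three differences are needed here regardless of r, which is the economy the text attributes to the formula when the degree is small relative to the number of terms.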


The formula (5.8.4) is principally useful for the finite summation of a
polynomial of degree small relative to the number of terms, so that
the number of terms in the transformed series is small relative to the
original number. In order to obtain a formula which is of more general
usefulness in finite or infinite summation, as well as in numerical integration,
we again first restrict attention to a polynomial p(x). From the relation

$$\Delta p(x) = DJp(x), \tag{5.8.5}$$

we may deduce also that

$$hp(x) = \left(\frac{hD}{e^{hD} - 1}\right)Jp(x). \tag{5.8.6}$$

The coefficients $B_\nu$ in the expansion

$$\frac{hD}{e^{hD} - 1} = \sum_{\nu=0}^{\infty}\frac{B_\nu}{\nu!}(hD)^\nu \tag{5.8.7}$$

are the so-called Bernoulli numbers, which occur in many fields of
mathematics.† It is found that $B_3 = B_5 = B_7 = \cdots = 0$, and the following
additional values may be listed:

$$B_0 = 1, \quad B_1 = -\tfrac{1}{2}, \quad B_2 = \tfrac{1}{6}, \quad B_4 = -\tfrac{1}{30}, \quad B_6 = \tfrac{1}{42}, \quad B_8 = -\tfrac{1}{30},$$
$$B_{10} = \tfrac{5}{66}, \quad B_{12} = -\tfrac{691}{2730}, \quad B_{14} = \tfrac{7}{6}, \quad B_{16} = -\tfrac{3617}{510}, \quad B_{18} = \tfrac{43867}{798}, \quad B_{20} = -\tfrac{174611}{330}. \tag{5.8.8}$$


† The notation $B_\nu$ is sometimes used for the present $B_{2\nu}$.

Hence, with this notation, (5.8.6) can be expressed in the form

$$hp(x) = \sum_{\nu=0}^{\infty}\frac{B_\nu}{\nu!}h^\nu D^\nu Jp(x)$$

or

$$hp(x) = \int_x^{x+h} p(t)\,dt + \sum_{\nu=1}^{\infty}\frac{B_\nu}{\nu!}h^\nu D^\nu Jp(x). \tag{5.8.9}$$

By using (5.8.5) to replace $D^\nu Jp(x)$ by $D^{\nu-1}[p(x + h) - p(x)]$, we may
express this result in the more explicit form

$$p(x) = \frac{1}{h}\int_x^{x+h} p(t)\,dt + \sum_{\nu=1}^{\infty}\frac{B_\nu}{\nu!}h^{\nu-1}\left[p^{(\nu-1)}(x + h) - p^{(\nu-1)}(x)\right]. \tag{5.8.10}$$

If we write (5.8.10) for

$$x = x_0, \quad x_1 = x_0 + h, \quad \ldots, \quad x_{r-1} = x_0 + (r - 1)h,$$

and sum the results, noticing the "telescoping" of the resultant terms in
brackets, we deduce the identity

$$\sum_{k=0}^{r-1} p_k = \frac{1}{h}\int_{x_0}^{x_r} p(x)\,dx + \sum_{\nu=1}^{\infty}\frac{B_\nu}{\nu!}h^{\nu-1}\left[p_r^{(\nu-1)} - p_0^{(\nu-1)}\right], \tag{5.8.11}$$

where $p_k = p(x_k)$ and $p_k^{(\nu-1)} = p^{(\nu-1)}(x_k)$. This result is usually known
as the Euler-Maclaurin sum formula for a polynomial, although that name
is also sometimes applied instead to (5.8.10), which leads to (5.8.11), or
to still another formula, which generalizes (5.8.11).
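The Bernoulli numbers entering (5.8.7) through (5.8.11) need not be taken from a table. Multiplying the series (5.8.7) by that of $(e^{hD} - 1)/hD$ and equating coefficients gives the standard recurrence $\sum_{j=0}^{m}\binom{m+1}{j}B_j = 0$ for $m \ge 1$, which the following sketch applies in exact arithmetic. The function name `bernoulli` is illustrative; the convention $B_1 = -\tfrac12$ is the one used in the text.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return [B_0, ..., B_n] with B_1 = -1/2, from the binomial recurrence."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        # sum_{j=0}^{m} C(m+1, j) B_j = 0, solved for B_m.
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


B = bernoulli(20)
print(B[12], B[20])  # the text lists -691/2730 and -174611/330
```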

It can be written in a somewhat more convenient form by making use
of the fact that all Bernoulli numbers with odd subscripts greater than
unity are zero. Thus, if we extract the term corresponding to $\nu = 1$,
and afterward replace $\nu$ by 2i, we obtain the form

$$\sum_{k=0}^{r} p_k = \frac{1}{h}\int_{x_0}^{x_r} p(x)\,dx + \frac{1}{2}(p_0 + p_r) + \sum_{i=1}^{\infty}\frac{B_{2i}h^{2i-1}}{(2i)!}\left[p_r^{(2i-1)} - p_0^{(2i-1)}\right]. \tag{5.8.12}$$

If the degree of the polynomial p(x) is 2m or 2m + 1, the series terminates
when i = m.

When f(x) is not a polynomial, the result of replacing p(x) by f(x) in
the series (5.8.12) must be terminated, say, with i = m, and an
appropriate error term must be introduced, so that we write

$$\sum_{k=0}^{r} f_k = \frac{1}{h}\int_{x_0}^{x_r} f(x)\,dx + \frac{1}{2}(f_0 + f_r) + \sum_{i=1}^{m}\frac{B_{2i}h^{2i-1}}{(2i)!}\left[f_r^{(2i-1)} - f_0^{(2i-1)}\right] + E_m. \tag{5.8.13}$$

It is known (see Prob. 23) that this error term is expressible in the form

$$E_m = r\,\frac{B_{2m+2}h^{2m+2}}{(2m + 2)!}\,f^{(2m+2)}(\xi), \tag{5.8.14}$$

where $x_0 < \xi < x_r$, when r is finite. When $r \to \infty$ and also $x_r \to \infty$,
this form becomes indeterminate and must be replaced by a somewhat
more elaborate one.

We see that this result relates a given sum and an integral in terms of 
an associated sum of m terms and a corresponding error term, where m 
can be chosen at pleasure. It is useful both in numerical integration and 
in numerical summation of series. 

As a particular case, we may let $r \to \infty$, assuming that f(x) is such
that the sum and the integral converge as $r \to \infty$, and that f and its first
2m + 2 derivatives tend to zero as $r \to \infty$. The resultant formula is
then

$$\sum_{k=0}^{\infty} f_k = \frac{1}{h}\int_{x_0}^{\infty} f(x)\,dx + \frac{1}{2}f_0 - \frac{h}{12}f_0' + \frac{h^3}{720}f_0''' - \frac{h^5}{30240}f_0^{\mathrm{v}} + \cdots - \frac{B_{2m}}{(2m)!}h^{2m-1}f_0^{(2m-1)} + E_m, \tag{5.8.15}$$

where $f_k = f(x_0 + kh)$.
Thus, for $f(x) = 1/x^2$, $x_0 = a$, and h = 1, we have

$$\frac{1}{a^2} + \frac{1}{(a + 1)^2} + \frac{1}{(a + 2)^2} + \cdots = \frac{1}{a} + \frac{1}{2a^2} + \frac{1}{6a^3} - \frac{1}{30a^5} + \cdots + \frac{B_{2m}}{a^{2m+1}} + E_m, \tag{5.8.16}$$

since here $f_0^{(2m-1)} = -(2m)!/a^{2m+1}$. Whereas the series on the left
converges rather slowly, the terms on the right decrease rapidly when a is
fairly large. Thus, if we take a = 100, there follows

$$\frac{1}{100^2} + \frac{1}{101^2} + \frac{1}{102^2} + \cdots = 10^{-2} + \tfrac{1}{2}\times10^{-4} + \tfrac{1}{6}\times10^{-6} - \tfrac{1}{30}\times10^{-10} + \cdots,$$

and the retention of only the first three terms on the right gives

$$\frac{1}{100^2} + \frac{1}{101^2} + \frac{1}{102^2} + \cdots = 0.0100501667,$$

correctly to 10 places. Nearly $2 \times 10^{10}$ terms of the original series would
be needed to supply this accuracy!

It is of interest to notice that $B_{2i}$ was shown by Euler to be expressible
in the form

$$B_{2i} = \frac{(-1)^{i-1}\,2\,(2i)!}{(2\pi)^{2i}}\left(1 + \frac{1}{2^{2i}} + \frac{1}{3^{2i}} + \cdots\right) \qquad (i \ge 1). \tag{5.8.17}$$

Thus, since (2i)! ultimately grows more rapidly than $a^{2i}$ for any fixed a,
it follows that, whereas $B_{2i}$ at first decreases with i, ultimately $|B_{2i}|$
increases more rapidly than $a^{2i}$ for any fixed a, as i increases without
limit. Hence it is evident that the result of omitting $E_m$ in the
right-hand member of (5.8.16) will not converge as $m \to \infty$. The expression
(5.8.14) is of no use when $r \to \infty$.† However, a test, described below,
shows indeed that here $E_m$ decreases in magnitude until m is
approximately equal to $\pi a$, after which it begins to increase unboundedly in
magnitude and to oscillate in sign. In the case a = 100, this would
mean that the retention of additional terms would continue to improve
the approximation until more than 300 terms were taken. However, in
the case a = 1, for which the left-hand member of (5.8.16) has the known
value $\pi^2/6 = 1.64493$, in accordance with (5.8.17), the right-hand
member becomes

$$1 + \tfrac{1}{2} + \tfrac{1}{6} - \tfrac{1}{30} + \tfrac{1}{42} - \tfrac{1}{30} + \tfrac{5}{66} - \tfrac{691}{2730} + \tfrac{7}{6} - \cdots.$$

Here the error E associated with the truncation of this series after n
terms varies with n as is indicated in the accompanying table.

     n       E
     1     0.645
     2     0.145
     3    -0.022
     4     0.012
     5    -0.012
     6     0.021
     7    -0.055
     8     0.198
     9    -0.968

† Notice, however, that it merely becomes of indeterminate form since, as $x_r \to \infty$,
the upper range of permissible values of $\xi$ increases without bound, and $f^{(2m+2)}(\xi) \to 0$
as $\xi \to \infty$.

This type of phenomenon, in which the successive members of a sequence
of approximations first approach nearer and nearer to the desired result,
and then begin to oscillate about it with ever-increasing amplitude, arises
very frequently in numerical analysis. Whereas such a situation can often
be brought about by prolonged propagation of round-off errors (and is
usually attributed to this cause by computers!), we have seen here, and
in §§3.7 and 4.11, that it can also result from successively progressing to
procedures of "higher-order accuracy," and hence eventually using too many
terms of a divergent (but asymptotic) series, even though it be assumed
that no round-off errors are introduced. Additional situations of this
type will be encountered in other chapters.
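Both numerical illustrations above, the rapid initial convergence for a = 100 and the eventual divergence for a = 1, can be reproduced from (5.8.16). The sketch below is an illustration under stated assumptions: the function names are invented here, the exact sum is taken as $\pi^2/6$ less the omitted leading terms, and ordinary floating-point arithmetic is used for the comparison.

```python
from fractions import Fraction
from math import comb, pi


def bernoulli(n):
    """[B_0, ..., B_n] with B_1 = -1/2, from sum_{j<=m} C(m+1, j) B_j = 0."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


def exact_tail(a):
    """1/a^2 + 1/(a+1)^2 + ... computed as pi^2/6 minus the first a-1 terms."""
    return pi ** 2 / 6 - sum(1.0 / k ** 2 for k in range(1, a))


def series_terms(a, n):
    """First n terms 1/a, 1/(2a^2), B_2/a^3, B_4/a^5, ... of (5.8.16)."""
    B = bernoulli(2 * n)
    a = Fraction(a)
    terms = [1 / a, 1 / (2 * a * a)] + [B[2 * i] / a ** (2 * i + 1) for i in range(1, n)]
    return terms[:n]


def truncation_errors(a, n_max):
    """Exact sum minus the partial sum after 1, 2, ..., n_max terms."""
    errors, partial = [], Fraction(0)
    for t in series_terms(a, n_max):
        partial += t
        errors.append(exact_tail(a) - float(partial))
    return errors


print(round(float(sum(series_terms(100, 3))), 10))    # three terms, a = 100
print([round(e, 3) for e in truncation_errors(1, 9)])  # error table for a = 1
```

For a = 1 the errors shrink through the fourth and fifth terms and then grow with alternating sign, which is the behavior of an asymptotic series described in the text.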

In the general summation formula of (5.8.13), it can be shown (see
Steffensen [18]) that, if $f^{(2m+2)}(x)$ and $f^{(2m+4)}(x)$ do not change sign for
$x_0 < x < x_r$, then $E_m$ is numerically smaller than the first neglected term
and is of the same sign. This rule applies to the special case of (5.8.16).
More generally, if it is known only that $f^{(2m+2)}$ does not change sign, then
it can be shown that $E_m$ is numerically smaller than twice the first
neglected term and of the same sign (see Prob. 23 and Steffensen [18]).

The fact that rules of this type apply rather frequently to interpolation 
series and to allied series makes the procedure of using the first omitted 
term as a basis for estimating the order of magnitude of the error some- 
what less hazardous in connection with such series than with convergent 
series more often encountered in other fields. 

A formula similar to (5.8.13), but summing instead the ordinates
midway between the successive ordinates involved in (5.8.13), is sometimes
called the second Eulerian sum formula (see Steffensen [18]) and is of the
form

$$\sum_{k=0}^{r-1} f_{k+1/2} = \frac{1}{h}\int_{x_0}^{x_r} f(x)\,dx - \sum_{i=1}^{m}\frac{(1 - 2^{1-2i})B_{2i}h^{2i-1}}{(2i)!}\left[f_r^{(2i-1)} - f_0^{(2i-1)}\right] + E_m, \tag{5.8.18}$$

where

$$E_m = -r\,\frac{(1 - 2^{-2m-1})B_{2m+2}h^{2m+2}}{(2m + 2)!}\,f^{(2m+2)}(\xi). \tag{5.8.19}$$
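Whereas the first formula is useful for either numerical summation or
numerical integration, the second is used chiefly for integration, in the
form

$$\int_{x_0}^{x_r} f(x)\,dx = h\left(f_{1/2} + f_{3/2} + \cdots + f_{r-1/2}\right) + \frac{h^2}{24}\left(f_r' - f_0'\right) - \frac{7h^4}{5760}\left(f_r''' - f_0'''\right) + \cdots. \tag{5.8.20}$$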

A comparison of (5.8.14) and (5.8.19) shows that the second formula 
is slightly more accurate than the first, on the average, when truncated 
with the same number of “correction terms.” 

Here, again, if $f^{(2m+2)}(x)$ and $f^{(2m+4)}(x)$ are of constant sign in $(x_0, x_r)$,
the error is numerically smaller than the first neglected term and is of the
same sign. If only $f^{(2m+2)}(x)$ is known to be of constant sign, then $E_m$
can be shown to be numerically smaller than three times the first neglected
term and of the same sign.

It may be seen that the "correction terms" in both (5.8.18) and (5.8.20)
all vanish if, say, f(x) is periodic, of period $x_r - x_0$, although the error
term naturally remains (see, for example, Prob. 30 of Chap. 3).†

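5.9. Formulas of Gregory and Gauss. Euler's Transformation. The
Euler-Maclaurin sum formula (5.8.12) can be written in the form

$$\int_{x_0}^{x_r} p(x)\,dx = h\left(\tfrac{1}{2}p_0 + p_1 + p_2 + \cdots + p_{r-1} + \tfrac{1}{2}p_r\right) - \frac{h^2}{12}\left(p_r' - p_0'\right)$$
$$+ \frac{h^4}{720}\left(p_r''' - p_0'''\right) - \frac{h^6}{30240}\left(p_r^{\mathrm{v}} - p_0^{\mathrm{v}}\right) + \cdots, \tag{5.9.1}$$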

for a polynomial, and hence can be considered as the trapezoidal rule 
with ‘‘correction terms” expressed in terms of derivatives of p(x) eval- 
uated at the ends of the interval. 

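A more frequently useful formula is obtained if the derivatives at $x_0$
are expressed in terms of forward differences, by using (5.3.5), and if
those at $x_r$ are expressed in terms of backward differences:

$$hp_0' = \Delta p_0 - \tfrac{1}{2}\Delta^2p_0 + \tfrac{1}{3}\Delta^3p_0 - \tfrac{1}{4}\Delta^4p_0 + \tfrac{1}{5}\Delta^5p_0 - \cdots,$$
$$hp_r' = \nabla p_r + \tfrac{1}{2}\nabla^2p_r + \tfrac{1}{3}\nabla^3p_r + \tfrac{1}{4}\nabla^4p_r + \tfrac{1}{5}\nabla^5p_r + \cdots,$$
$$h^3p_0''' = \Delta^3p_0 - \tfrac{3}{2}\Delta^4p_0 + \tfrac{7}{4}\Delta^5p_0 - \cdots,$$
$$h^3p_r''' = \nabla^3p_r + \tfrac{3}{2}\nabla^4p_r + \tfrac{7}{4}\nabla^5p_r + \cdots,$$
$$h^5p_0^{\mathrm{v}} = \Delta^5p_0 - \cdots,$$
$$h^5p_r^{\mathrm{v}} = \nabla^5p_r + \cdots.$$

The result of this substitution is of the form

$$\int_{x_0}^{x_r} p(x)\,dx = h\left(\tfrac{1}{2}p_0 + p_1 + p_2 + \cdots + p_{r-1} + \tfrac{1}{2}p_r\right)$$
$$- \frac{h}{12}\left(\nabla p_r - \Delta p_0\right) - \frac{h}{24}\left(\nabla^2p_r + \Delta^2p_0\right) - \frac{19h}{720}\left(\nabla^3p_r - \Delta^3p_0\right)$$
$$- \frac{3h}{160}\left(\nabla^4p_r + \Delta^4p_0\right) - \frac{863h}{60480}\left(\nabla^5p_r - \Delta^5p_0\right) - \cdots, \tag{5.9.2}$$

and is known as Gregory's formula. If no differences beyond the rth are
retained, only values of the integrand in the interval of integration are
involved. The error term associated with the substitution of f(x) for
p(x), and subsequent truncation of the series, is rather complicated.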
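Gregory's formula (5.9.2) lends itself to a compact routine, since the required forward differences at $x_0$ and backward differences at $x_r$ are simply the first and last entries of each column of the difference table. The following is a minimal sketch (the name `gregory` and the restriction to five correction terms are choices of this illustration), applied to $\int_0^1 e^x\,dx$ with h = 0.1.

```python
from math import exp

# Coefficients of the successive end corrections in (5.9.2).
WEIGHTS = [1 / 12, 1 / 24, 19 / 720, 3 / 160, 863 / 60480]


def gregory(values, h, order):
    """Trapezoidal sum plus Gregory end corrections through the given order (1 to 5)."""
    total = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    row = list(values)
    for j in range(1, order + 1):
        row = [b - a for a, b in zip(row, row[1:])]
        fwd, bwd = row[0], row[-1]  # Δ^j f_0 and ∇^j f_r
        # Odd orders enter as (∇ - Δ), even orders as (∇ + Δ).
        total -= h * WEIGHTS[j - 1] * (bwd + (-1) ** j * fwd)
    return total


h = 0.1
ordinates = [exp(k * h) for k in range(11)]
exact = exp(1.0) - 1.0
print(gregory(ordinates, h, 0) - exact)  # trapezoidal rule alone
print(gregory(ordinates, h, 3) - exact)  # with corrections through third differences
```

With `order=0` the routine reduces to the trapezoidal rule, so the two printed errors show directly what the end corrections contribute.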

On the other hand, if we replace the derivatives at $x_0$ and at $x_r$ in (5.9.1)
by mean central differences at those points, by use of the formulas
obtained in §5.3, of the form

$$hp' = \mu\delta p - \tfrac{1}{6}\mu\delta^3p + \tfrac{1}{30}\mu\delta^5p - \cdots,$$
$$h^3p''' = \mu\delta^3p - \tfrac{1}{4}\mu\delta^5p + \cdots,$$
$$h^5p^{\mathrm{v}} = \mu\delta^5p - \cdots,$$

we obtain the formula

$$\int_{x_0}^{x_r} p(x)\,dx = h\left(\tfrac{1}{2}p_0 + p_1 + p_2 + \cdots + p_{r-1} + \tfrac{1}{2}p_r\right) - \frac{h}{12}\left(\mu\delta p_r - \mu\delta p_0\right)$$
$$+ \frac{11h}{720}\left(\mu\delta^3p_r - \mu\delta^3p_0\right) - \frac{191h}{60480}\left(\mu\delta^5p_r - \mu\delta^5p_0\right) + \cdots. \tag{5.9.3}$$

This formula, associated with the name of Gauss, has the property that
its leading coefficients decrease more rapidly than do those in Gregory's
formula. However, it has the disadvantage that it always involves
values of the integrand which lie outside the range of integration.

† For significant applications of these formulas in such cases, see Y. L. Luke,
Simple Formulas for the Evaluation of Some Higher Transcendental Functions,
J. Math. and Phys., 34:298-307 (1956).

It is known (see Steffensen [18]) that the error term to be introduced
in (5.9.3), when p(x) is replaced by f(x) and the series is truncated with
differences of order 2k - 1, is obtained by replacing the contents of the
parentheses in the first omitted term by $rh^{2k+2}f^{(2k+2)}(\xi)$, where $\xi$ lies between
the extreme relevant values of x.

Both these formulas can also be derived directly by operational 
methods (see Probs. 28 and 29). 

The formulas (5.9.1) to (5.9.3) are expressed in a form suitable for
approximate evaluation of the relevant integral. When the integral can
be evaluated otherwise, and the formulas are to be used instead, say, for
approximate summation of an infinite series, they may be expressed in
the form

$$\sum_{k=0}^{\infty} f_k = \frac{1}{h}\int_{x_0}^{\infty} f(x)\,dx + \begin{cases}\tfrac{1}{2}\left(f_0 - \tfrac{1}{6}hf_0' + \tfrac{1}{360}h^3f_0''' - \cdots\right), & (5.9.4)\\[4pt] \tfrac{1}{2}\left(f_0 - \tfrac{1}{6}\Delta f_0 + \tfrac{1}{12}\Delta^2f_0 - \cdots\right), & (5.9.5)\\[4pt] \tfrac{1}{2}\left(f_0 - \tfrac{1}{6}\mu\delta f_0 + \tfrac{11}{360}\mu\delta^3f_0 - \cdots\right), & (5.9.6)\end{cases}$$

where $f_k = f(x_0 + kh)$, when applied formally to a function f(x), under
the assumptions that f(x) and its derivatives vanish as $x \to \infty$, and that
the series and integral are convergent.

If the terms $f_k$ are of constant sign and decrease slowly in magnitude,
so that the given series converges slowly, the successive terms in the
transformed series generally decrease rapidly in magnitude, at least up
to a certain stage. Thus these series, while generally asymptotic, are
often useful for calculation in such cases. However, since the right-hand
member of (5.9.6) involves values of $f_k = f(x_0 + kh)$ for negative
$k$, it cannot be used if values needed are undefined, and it is inefficient if
$f_k$ changes rapidly with $k$ when $k$ is negative.

These transformations are usually not useful when the terms in the
given series fluctuate in sign. However, in those situations when the
signs of successive terms steadily alternate, there exist more appropriate
transformations, of similar type, which possess the additional advantage
that their use does not involve the evaluation of an integral. Their
formal derivation is simply effected by noticing that the operational
relation

$$p_0 - p_1 + p_2 - p_3 + \cdots = (1 - E + E^2 - E^3 + \cdots)\,p_0 = \frac{1}{1+E}\,p_0 \tag{5.9.7}$$

is valid for any polynomial $p(x)$, and that we have also

$$\frac{1}{1+E} = \frac12\left(1 - \tanh\frac{hD}{2}\right) = \frac{1}{2+\Delta} = \frac12\left(1 - \frac{\tfrac12\mu\delta}{1 + \tfrac14\delta^2}\right). \tag{5.9.8}$$

Hence, by formally replacing $p$ by $f$ in (5.9.7) and expanding the operator
$1/(1 + E)$ in accordance with (5.9.8), we obtain the relations

$$\sum_{k=0}^{\infty}(-1)^k f_k = \tfrac12 f_0 - \tfrac14 h f_0' + \tfrac1{48}h^3 f_0''' - \tfrac1{480}h^5 f_0^{\mathrm{v}} + \tfrac{17}{80640}h^7 f_0^{\mathrm{vii}} - \cdots, \tag{5.9.9}$$

$$\sum_{k=0}^{\infty}(-1)^k f_k = \tfrac12\left(f_0 - \tfrac12\Delta f_0 + \tfrac14\Delta^2 f_0 - \cdots + (-1)^n 2^{-n}\Delta^n f_0 + \cdots\right), \tag{5.9.10}$$

$$\sum_{k=0}^{\infty}(-1)^k f_k = \tfrac12\left(f_0 - \tfrac12\mu\delta f_0 + \tfrac18\mu\delta^3 f_0 - \cdots + (-1)^{n+1}2^{-2n-1}\mu\delta^{2n+1} f_0 + \cdots\right). \tag{5.9.11}$$
The second relation (5.9.10), expressed in terms of forward differences,
is often known as Euler's transformation.† It is known (see Hardy [102])
that the transformed series in (5.9.10) will converge whenever the parent
series does so, and to the same sum. Indeed, the transformed series
may converge when the parent series does not, in which case the sum of
the transformed series is often called the Euler sum of the parent series.

The other two transformed series, (5.9.9) and (5.9.11), are generally
asymptotic, but their rate of effective convergence is
often more rapid. As in (5.9.6), terms on the right in (5.9.11) may be
undefined after a certain stage.

In illustration, if only the first four terms of the series

$$S = 1 - \frac12 + \frac13 - \frac14 + \cdots + (-1)^{k+1}\frac1k + \cdots \quad (= \log 2) \tag{5.9.12}$$

are summed initially, to give

$$S = \tfrac{7}{12} + \left(\tfrac15 - \tfrac16 + \tfrac17 - \cdots\right),$$

the use of (5.9.9) to (5.9.11), with $f(x) = 1/x$, $x_0 = 5$, and $h = 1$, is found
to give the relations

$$S = \tfrac{7}{12} + \tfrac{1}{10} + \tfrac{1}{100} - \tfrac{1}{5000} + \tfrac{1}{62500} - \tfrac{17}{6250000} + \cdots, \tag{5.9.13}$$

$$S = \tfrac{7}{12} + \tfrac{1}{10} + \tfrac{1}{120} + \tfrac{1}{840} + \tfrac{1}{4480} + \tfrac{1}{20160} + \cdots, \tag{5.9.14}$$

$$S = \tfrac{7}{12} + \tfrac{1}{10} + \tfrac{1}{96} - \tfrac{1}{1344} + \tfrac{5}{21504} - \tfrac{5}{18432} + \cdots, \tag{5.9.15}$$

after an appropriate tabulation and differencing of the ordinates $f_k$ in
the last two cases.

† This transformation is closely related to that considered in Probs. 7 and 8 of
Chap. 1.

Retention of five terms of the transformed series in (5.9.13) yields an
approximation to $\log 2 = 0.693147183$ with an error smaller than
$6 \times 10^{-7}$, whereas the same truncation of the Euler series (5.9.14) is in
error by about $2 \times 10^{-5}$. If additional terms were retained in these two
series, the second would continue to converge indefinitely, whereas the
oscillation of the first series eventually (after about nine terms) would
begin to increase unboundedly. A consideration of the series (5.9.15)
shows that, whereas retention of successive terms through the fourth
increases the accuracy of the approximation, until the error is decreased
to about $9 \times 10^{-5}$, the retention of the fifth "correction term" increases
the error in magnitude. In addition, since the sixth term would introduce
the ordinate $f_{-5}$, which is infinite, the series is undefined beyond this
stage in the present case.
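
The Euler series (5.9.14) can be reproduced with exact rational arithmetic. The following is only a minimal sketch: the helper names `forward_differences` and `euler_terms` are illustrative assumptions, not notation from the text, and the code simply evaluates the terms $(-1)^k 2^{-(k+1)}\Delta^k f_0$ of (5.9.10) for $f_k = 1/(5+k)$.

```python
from fractions import Fraction
from math import log

def forward_differences(values):
    """Return [f_0, Δf_0, Δ²f_0, ...] built from the list [f_0, f_1, ...]."""
    diffs, row = [], list(values)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs

def euler_terms(f_values):
    """Terms (-1)^k Δ^k f_0 / 2^(k+1) of the Euler transformation (5.9.10)."""
    d = forward_differences(f_values)
    return [Fraction((-1) ** k, 2 ** (k + 1)) * d[k] for k in range(len(d))]

# First four terms of 1 - 1/2 + 1/3 - 1/4 + ... summed directly (= 7/12).
head = Fraction(1) - Fraction(1, 2) + Fraction(1, 3) - Fraction(1, 4)

# Remaining tail 1/5 - 1/6 + 1/7 - ... handled by five Euler terms.
terms = euler_terms([Fraction(1, 5 + k) for k in range(5)])
approx = head + sum(terms)
error = abs(float(approx) - log(2))

print(terms)    # the five transformed terms as exact fractions
print(error)    # truncation error of the five-term Euler series
```

Under these assumptions the printed terms can be compared directly with the fractions appearing in (5.9.14).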

More efficient transformations would have been effected by summing 
more than four terms of the given series in advance. 

A useful variant of the Euler transformation (5.9.10), which also
yields a convergent series when the parent series converges, is expressible
in the form‡

$$\sum_{k=0}^{\infty}(-1)^k f_k = \sum_{k=0}^{n}\frac{(-1)^k}{2^{k+1}}\Delta^k f_0 + \frac{(-1)^{n+1}}{2^{n+1}}\sum_{k=0}^{\infty}(-1)^k\Delta^{n+1} f_k. \tag{5.9.16}$$

The right-hand member can be interpreted as the result of truncating
the Euler formula with $n$th differences and expressing the error term as an
infinite series of $(n + 1)$th differences. In particular, if $\Delta^{n+1} f_k$ is of constant
sign for $k \ge 0$ and tends steadily to zero as $k \to \infty$, we may deduce
that the truncation error in the Euler formula (5.9.10) is smaller in
magnitude than twice the first omitted term and is of the same sign.
This situation will exist if $f^{(n+1)}(x)$ is of constant sign when $x \ge x_0$ and if
it tends steadily to zero as $x \to \infty$.

The Euler transformation is most efficient when the alternating series
$f_0 - f_1 + f_2 - \cdots$ converges very slowly, so that $f_k$ tends to zero, say,
like $1/k$ as $k \to \infty$. When $f_k$ tends to zero, say, like $r^k$ $(r < 1)$, so that
the series simulates an alternating geometric series, a useful generalization
results from writing

$$f_k = r^k g_k, \tag{5.9.17}$$

where $r$ may be identified, for example, with a representative value

‡ This formula can be obtained by operational methods or, rigorously, by repeated
applications of the transformation considered in Prob. 7 of Chap. 1.































alter which the transformation may be iterated. 
of $f_{k+1}/f_k$ or with its limit as $k \to \infty$.

The formal symbolic relation

$$f_0 - f_1 + f_2 - f_3 + \cdots = (1 - E + E^2 - E^3 + \cdots)f_0$$

then becomes

$$f_0 - f_1 + f_2 - f_3 + \cdots = (1 - rE + r^2E^2 - r^3E^3 + \cdots)g_0 = \frac{1}{1 + rE}\,g_0 = \frac{1}{(1+r) + r\Delta}\,g_0$$

and yields the formula

$$\sum_{k=0}^{\infty}(-1)^k f_k = \frac{1}{1+r}\left[g_0 - \frac{r}{1+r}\Delta g_0 + \left(\frac{r}{1+r}\right)^2\Delta^2 g_0 - \cdots\right] \qquad (f_k = r^k g_k), \tag{5.9.18}$$

which reduces to (5.9.10) when $r$ is taken to be unity. For any fixed
$r > 0$, the right-hand member will converge when the left-hand member
does so.
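
As a hedged illustration of (5.9.18), the sketch below applies it to the series $\sum (-1)^k r^k/(k+1)$ with $r = \tfrac12$, whose sum is $\log(1+r)/r$. The test series, the choice $g_k = 1/(k+1)$, and the function names are assumptions made only for this example.

```python
from fractions import Fraction
from math import log

def forward_differences(values):
    """Return [g_0, Δg_0, Δ²g_0, ...] built from the list [g_0, g_1, ...]."""
    diffs, row = [], list(values)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs

def generalized_euler(g_values, r):
    """Partial sum of (5.9.18), using every difference available from g_values."""
    q = -r / (1 + r)
    d = forward_differences(g_values)
    return sum(q ** m * d[m] for m in range(len(d))) / (1 + r)

r = Fraction(1, 2)
n = 10
g = [Fraction(1, k + 1) for k in range(n)]           # slowly varying factor g_k

direct = sum((-1) ** k * r ** k * g[k] for k in range(n))   # n terms of the parent series
transformed = generalized_euler(g, r)                        # n terms of (5.9.18)

exact = log(1.5) / 0.5
err_direct = abs(float(direct) - exact)
err_transformed = abs(float(transformed) - exact)
print(err_direct, err_transformed)
```

With the same ten ordinates, the transformed series is considerably closer to the limit than the direct partial sum in this particular case.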





Other generalizations of a similar nature are readily devised. Thus,
if we write $f_k = c_k g_k$, we may derive the formal relation

$$\sum_{k=0}^{\infty}(-1)^k f_k = \left[\phi(1) + \frac{\phi'(1)}{1!}\Delta + \frac{\phi''(1)}{2!}\Delta^2 + \cdots\right]g_0 \qquad (f_k = c_k g_k), \tag{5.9.19}$$

where $\phi(x)$ is the function possessing the expansion

$$\phi(x) = \sum_{k=0}^{\infty}(-1)^k c_k x^k \tag{5.9.20}$$

when $|x|$ is small. Here $c_k$ is to be determined so that $g_k$ tends to vary
slowly with increasing $k$ and, desirably, so that $\phi(x)$ is identifiable in
closed form.
A related class of transformations, which frequently accelerate the
convergence of alternating series, deals directly with the sequence of
partial sums $S_r$, such that

$$S_r = f_0 - f_1 + f_2 - \cdots + (-1)^r f_r, \tag{5.9.21}$$

and replaces the sequence $S_0, S_1, \ldots, S_r, \ldots$ by a sequence
$T_0, T_1, \ldots, T_r, \ldots$, where

$$T_r = \frac{w_0 S_r + w_1 S_{r-1} + \cdots + w_r S_0}{w_0 + w_1 + \cdots + w_r}, \tag{5.9.22}$$

with a suitable definition of the weighting coefficients $w_0, w_1, \ldots, w_r, \ldots$.
It is known (see Hardy [102]) that the $T$ sequence will converge to the same limit as does the $S$
sequence if the conditions

$$w_0 > 0, \qquad w_k \ge 0 \ (k \ge 1), \qquad \lim_{k\to\infty}\frac{w_k}{w_0 + w_1 + \cdots + w_k} = 0 \tag{5.9.23}$$

are satisfied.
The choices $w_0 = w_1 = \cdots = w_r = 1$ and


_ = Wy = 


are most often used, the latter often being particularly efficient (when the 
f’s are positive), and they are associated with the names of Cesaro and 
Hutton, respectively. 

5.10. Special Integration Formulas. By starting with any integration
formula, we may form a new one by adding a multiple of a difference,
say $\delta^{2r} f_0$, and subtracting the same multiple of $h^{2r} f^{(2r)}(\xi)$, where $\xi$ is between
$x_0 - rh$ and $x_0 + rh$, from the remainder term. Two rather well known
formulas, which may be derived in this way, are next considered briefly.

The Newton-Cotes seven-point formula (3.5.15) can be written in the
form

$$\int_{x_0-3h}^{x_0+3h} f(x)\,dx = \frac{h}{140}\left(41f_{-3} + 216f_{-2} + 27f_{-1} + 272f_0 + 27f_1 + 216f_2 + 41f_3\right) - \frac{9h^9}{1400} f^{\mathrm{viii}}(\xi_1), \tag{5.10.1}$$

where $x_0 - 3h < \xi_1 < x_0 + 3h$. In addition, we have the relation

$$\frac{h}{140}\delta^6 f_0 = \frac{h}{140}\left(f_{-3} - 6f_{-2} + 15f_{-1} - 20f_0 + 15f_1 - 6f_2 + f_3\right). \tag{5.10.2}$$


If (5.10.2) is added to (5.10.1), and $\delta^6 f_0$ is replaced by $h^6 f^{\mathrm{vi}}(\xi_2)$ in the
result, the formula

$$\int_{x_0-3h}^{x_0+3h} f(x)\,dx = \frac{3h}{10}\left(f_{-3} + 5f_{-2} + f_{-1} + 6f_0 + f_1 + 5f_2 + f_3\right) - \frac{h^7}{1400}\left[10 f^{\mathrm{vi}}(\xi_2) + 9h^2 f^{\mathrm{viii}}(\xi_1)\right], \tag{5.10.3}$$

known as Weddle's rule, is obtained. It is notable chiefly because of the
simplicity of its coefficients, but can be used only when the interval of
integration can be conveniently divided into six subintervals. Also, its
accuracy is usually inferior to that of (5.10.1), and its error term is more
complicated.

If the ordinates $f_{-1}$ and $f_1$ are eliminated between (5.10.1) and (5.10.2)
by subtracting $\tfrac95$ times the latter from the former, and $\delta^6 f_0$ is
replaced by $h^6 f^{\mathrm{vi}}(\xi_2)$, the formula

$$\int_{x_0-3h}^{x_0+3h} f(x)\,dx = \frac{h}{100}\left(28f_{-3} + 162f_{-2} + 220f_0 + 162f_2 + 28f_3\right) + \frac{9h^7}{1400}\left[2 f^{\mathrm{vi}}(\xi_2) - h^2 f^{\mathrm{viii}}(\xi_1)\right], \tag{5.10.4}$$

known as Hardy's rule, is obtained. For equal integration ranges, its
accuracy is usually somewhat superior to the Newton-Cotes five-point
formula, but it again requires ordinates which divide the range into six
equal parts.†

† The error comparisons with Newton-Cotes formulas are based on the supposition
that $h^2 f^{\mathrm{viii}}$ is small relative to $f^{\mathrm{vi}}$. In those cases when $h^2 f^{\mathrm{viii}}$ is comparable with or
large relative to $f^{\mathrm{vi}}$, as may happen when $f(x)$ possesses a singularity in the complex
plane, near the path of integration, these conclusions may not follow. In such cases,
the use of a simpler composite formula, such as that of the parabolic rule, may yield
better results than any of these more elaborate formulas (see §3.7).
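
The coefficients in (5.10.1), (5.10.3), and (5.10.4) can be checked on monomials with exact arithmetic. The following is a small sketch under the assumptions $x_0 = 0$ and $h = 1$; the names `NC7`, `WEDDLE`, `HARDY`, and `seven_point` are hypothetical and serve only this check.

```python
from fractions import Fraction

# Weights on the ordinates f_{-3}, ..., f_3 (each to be multiplied by h).
NC7 = [Fraction(c, 140) for c in (41, 216, 27, 272, 27, 216, 41)]
WEDDLE = [Fraction(3 * c, 10) for c in (1, 5, 1, 6, 1, 5, 1)]
HARDY = [Fraction(c, 100) for c in (28, 162, 0, 220, 0, 162, 28)]

def seven_point(weights, f, x0, h):
    """Apply a seven-ordinate rule over (x0 - 3h, x0 + 3h)."""
    return h * sum(w * f(x0 + k * h) for w, k in zip(weights, range(-3, 4)))

def exact_monomial(p, lo, hi):
    """Exact integral of x^p over (lo, hi)."""
    return (Fraction(hi) ** (p + 1) - Fraction(lo) ** (p + 1)) / (p + 1)

x0, h = Fraction(0), Fraction(1)

def residual(weights, p):
    """Exact integral of x^p minus the rule's value."""
    return exact_monomial(p, x0 - 3 * h, x0 + 3 * h) - seven_point(
        weights, lambda x: x ** p, x0, h)

# For f = x^6 we have f^vi = 720 and f^viii = 0, so the error terms predict
# -(1/1400)(10)(720) for Weddle's rule and +(9/1400)(2)(720) for Hardy's rule.
print(residual(NC7, 6), residual(WEDDLE, 6), residual(HARDY, 6))
```

The residuals for $x^6$ agree with the error terms in (5.10.3) and (5.10.4), while the seven-point Newton-Cotes formula integrates $x^6$ exactly.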

As a further example, suppose that values of $f(x_0 \pm h)$ and $f(x_0 \pm 2h)$
are obtained by observation, but that the value of $f(x_0)$ cannot be measured,
and that the integral of $f(x)$ over $(x_0 - 2h, x_0 + 2h)$ is to be approximated.
To obtain an appropriate formula, we may write down the
Newton-Cotes five-point formula in the form

$$\int_{x_0-2h}^{x_0+2h} f(x)\,dx = \frac{2h}{45}\left(7f_{-2} + 32f_{-1} + 12f_0 + 32f_1 + 7f_2\right) - \frac{8h^7}{945} f^{\mathrm{vi}}(\xi_1), \tag{5.10.5}$$

and note that

$$\frac{2h}{45}\delta^4 f_0 = \frac{2h}{45}\left(f_{-2} - 4f_{-1} + 6f_0 - 4f_1 + f_2\right). \tag{5.10.6}$$

The ordinate $f_0$ is then eliminated by subtracting twice the second expression
from the first, and replacing $\delta^4 f_0$ by $h^4 f^{\mathrm{iv}}(\xi_2)$, to give the formula

$$\int_{x_0-2h}^{x_0+2h} f(x)\,dx = \frac{2h}{9}\left(f_{-2} + 8f_{-1} + 8f_1 + f_2\right) + \frac{4h^5}{945}\left[21 f^{\mathrm{iv}}(\xi_2) - 2h^2 f^{\mathrm{vi}}(\xi_1)\right]. \tag{5.10.7}$$

The same formula, with the error term expressed in the somewhat less
tractable form

$$\frac{1}{24}\int_{x_0-2h}^{x_0+2h}\left[(x - x_0)^2 - h^2\right]\left[(x - x_0)^2 - 4h^2\right] f^{\mathrm{iv}}(\xi)\,dx,$$

would be obtained by determining the Lagrange interpolation polynomial
which agrees with $f(x)$ at the four relevant points, and integrating it over
the given interval.
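
A brief check of (5.10.7), assuming $x_0 = 0$ and $h = 1$: the rule should integrate cubics exactly, and for $f = x^4$ (where $f^{\mathrm{iv}} = 24$ and $f^{\mathrm{vi}} = 0$) the error term predicts $\tfrac{4}{945}\cdot 21\cdot 24 = \tfrac{32}{15}$. The function names below are illustrative assumptions only.

```python
from fractions import Fraction

def four_point_rule(f, x0, h):
    """Formula (5.10.7) without its error term; the ordinate at x0 is not used."""
    return Fraction(2, 9) * h * (
        f(x0 - 2 * h) + 8 * f(x0 - h) + 8 * f(x0 + h) + f(x0 + 2 * h))

def exact_monomial(p, lo, hi):
    """Exact integral of x^p over (lo, hi)."""
    return (Fraction(hi) ** (p + 1) - Fraction(lo) ** (p + 1)) / (p + 1)

x0, h = Fraction(0), Fraction(1)

# Exact integral minus rule value for f = 1, x, x^2, x^3, x^4.
residuals = [
    exact_monomial(p, x0 - 2 * h, x0 + 2 * h)
    - four_point_rule(lambda x, p=p: x ** p, x0, h)
    for p in range(5)
]
print(residuals)
```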

5.11. Error Terms in Integration Formulas. This section presents
methods of obtaining expressions for the error term to be inserted in a 


| The error comparisons with Newton-Cotes formulas are based on the supposition 
In those cases when h?fviii is comparable with or 
formula for numerical integration, obtained (by operational methods or
otherwise) in such a way that it reduces to an identity when applied to a
polynomial of sufficiently low degree, in those cases when the formula is
applied to a function of more general type. The methods are readily
modified to the consideration of formulas for interpolation or for numerical
differentiation as well.

For present purposes, it is convenient to suppose that the formula is
expressed explicitly in terms of ordinates, rather than differences or
divided differences. Also, in order to include a class of formulas to be
developed in Chap. 8, as well as those so far considered, we suppose that
the formula is of the rather general form

$$\int_a^b w(x) f(x)\,dx = \sum_{k=0}^{n} W_k f(x_k) + R, \tag{5.11.1}$$

where $w(x)$ is a prescribed weighting function, which is unity in the formulas
so far considered and which is nonnegative in $(a,b)$ in most other
applications; where $x_0, x_1, \ldots, x_n$ are $n + 1$ abscissas, not necessarily
equally spaced; and where $W_0, W_1, \ldots, W_n$ are the corresponding
so-called weighting coefficients.

It is supposed that the required error $R$ is zero when $f(x)$ is any polynomial
of degree $N$ or less. If also $R$ is not zero when $f(x)$ is a polynomial
of degree $N + 1$, then $N$ is called the degree of precision of the integration
formula. However, we suppose here only that the degree of precision is
at least $N$, where $N$ is a known positive integer. We also assume explicitly
that $w(x) \ge 0$ in $(a,b)$.

We may transpose Eq. (5.11.1) into the form

$$R[f(x)] = \int_a^b w(x) f(x)\,dx - \sum_{k=0}^{n} W_k f(x_k), \tag{5.11.2}$$

where the notation $R[f(x)]$ is used to indicate that the operation involved
in the right-hand member has been effected on $f(x)$. Our hypothesis,
therefore, is that

$$R[x^r] = 0 \qquad (r = 0, 1, 2, \ldots, N). \tag{5.11.3}$$

In order to treat situations in which some of the abscissas lie outside the
integration range $(a,b)$, we suppose that the abscissas are ordered in
increasing algebraic order and denote the smaller of $x_0$ and $a$ by $A$ and
the larger of $x_n$ and $b$ by $B$, so that all relevant values of $x$ lie in the
interval $I = (A,B)$. Attention is restricted to those functions which
possess $N + 1$ continuous derivatives in $(A,B)$.
Then, for any values of $x$ and $\bar{x}$ in $(A,B)$, we can write

$$f(x) = f(\bar{x}) + \frac{f'(\bar{x})}{1!}(x - \bar{x}) + \frac{f''(\bar{x})}{2!}(x - \bar{x})^2 + \cdots + \frac{f^{(N)}(\bar{x})}{N!}(x - \bar{x})^N + \frac{f^{(N+1)}(\xi)}{(N+1)!}(x - \bar{x})^{N+1}, \tag{5.11.4}$$

where, for any fixed $\bar{x}$ in $I$, $\xi$ depends upon $x$, but lies in $(A,B)$. Since
the first $N + 1$ terms in the right-hand member comprise a polynomial
of degree $N$, which is annihilated by the operator in (5.11.2), the error
$R[f(x)]$ is the same as the error term corresponding to the remainder term

$$E_N(x) = \frac{f^{(N+1)}(\xi)}{(N+1)!}(x - \bar{x})^{N+1}, \tag{5.11.5}$$

and hence

$$(N+1)!\,R[f(x)] = \int_a^b w(x)(x - \bar{x})^{N+1} f^{(N+1)}(\xi)\,dx - \sum_{k=0}^{n} W_k (x_k - \bar{x})^{N+1} f^{(N+1)}(\xi_k), \tag{5.11.6}$$

where $\bar{x}, \xi, \xi_0, \xi_1, \ldots, \xi_n$ all lie inside $(A,B)$.

This form of the error term is generally not a very useful one. However,
if we denote the maximum value of $|f^{(N+1)}(x)|$ in $(A,B)$ by $M$, and
notice that $|x - \bar{x}| \le (B - A)/2$ in $(A,B)$ when $\bar{x} = (A + B)/2$, it permits
the crude estimate

$$|R| \le \frac{M L^{N+1}}{2^{N+1}(N+1)!}\left[\int_a^b w(x)\,dx + \sum_{k=0}^{n}|W_k|\right], \tag{5.11.7}$$

where

$$L = B - A, \qquad |f^{(N+1)}(x)| \le M \text{ in } (A,B). \tag{5.11.8}$$

Since $R = 0$ in (5.11.1) when $f(x) = 1$, there follows

$$\int_a^b w(x)\,dx = \sum_{k=0}^{n} W_k. \tag{5.11.9}$$

Hence, in those cases when all the weights $W_k$ are nonnegative, the error
bound (5.11.7) can be expressed in the simpler form

$$|R| \le \frac{M L^{N+1}}{2^{N}(N+1)!}\int_a^b w(x)\,dx, \tag{5.11.10}$$

where $L = b - a$ when none of the abscissas lies outside $(a,b)$.
This error bound, while of simple form, is often extremely conservative.

In order to obtain a more useful form, we may replace the remainder
(5.11.5) by the integral form

$$E_N(x) = \frac{1}{N!}\int_{\bar{x}}^{x}(x - s)^N f^{(N+1)}(s)\,ds, \tag{5.11.11}$$

which possesses the advantage that no unknown parameter, corresponding
to the $\xi$ in (5.11.5), appears (see §1.7). If we identify $\bar{x}$ with $A$, the
relation (5.11.6) is then replaced by the form

$$N!\,R[f(x)] = \int_a^b w(x)\int_A^{x}(x - s)^N f^{(N+1)}(s)\,ds\,dx - \sum_{k=0}^{n} W_k\int_A^{x_k}(x_k - s)^N f^{(N+1)}(s)\,ds. \tag{5.11.12}$$
2 ii : 


In order to express this result in more convenient form, it is useful to
introduce the notation

$$(x - s)_+^k = \begin{cases}(x - s)^k & \text{when } x \ge s,\\ 0 & \text{when } x < s,\end{cases} \tag{5.11.13}$$

in accordance with which (5.11.12) can be written in the form

$$N!\,R[f(x)] = \int_a^b w(x)\int_A^B (x - s)_+^N f^{(N+1)}(s)\,ds\,dx - \sum_{k=0}^{n} W_k\int_A^B (x_k - s)_+^N f^{(N+1)}(s)\,ds. \tag{5.11.14}$$

Since the integration limits are now constant, the order of integration is
readily reversed, to give

$$N!\,R[f(x)] = \int_A^B f^{(N+1)}(s)\left[\int_a^b (x - s)_+^N w(x)\,dx - \sum_{k=0}^{n} W_k (x_k - s)_+^N\right]ds$$

or, equivalently,

$$R[f(x)] = \int_A^B G(s) f^{(N+1)}(s)\,ds, \tag{5.11.15}$$

where $G(s)$ is defined by the equation

$$N!\,G(s) = \int_a^b (x - s)_+^N w(x)\,dx - \sum_{k=0}^{n} W_k (x_k - s)_+^N, \tag{5.11.16}$$

and may be called the influence function (or "kernel function") for the
integration formula (5.11.1), relevant to $N$.†

It is useful to notice that $G(s)$ can be considered as the error in (5.11.1)
when $f(x)$ is identified with $(x - s)_+^N/N!$. The definition can also be
expressed in the more explicit form

$$N!\,G(s) = \left\{\begin{array}{ll}\displaystyle\int_a^b (x - s)^N w(x)\,dx & (s \le a)\\[2mm] \displaystyle\int_s^b (x - s)^N w(x)\,dx & (a < s \le b)\\[2mm] 0 & (s > b)\end{array}\right\} - \sum_{x_k \ge s} W_k (x_k - s)^N, \tag{5.11.17}$$


+ This form appears to be due to Peano and Rémés, See Sard [205] and M 
[10, 153}. 
where the notation in the right-hand member indicates that the sum is
to be taken over those values of $k$ for which $x_k \ge s$. It is easily seen that
$G(s)$ vanishes for all values of $s$ outside the interval $(A,B)$ over which the
integration is effected in (5.11.15).

In illustration, we consider the simple integration formula

$$\int_{-1}^{1} f(x)\,dx = f(a) + f(-a) + R \qquad (0 \le a \le 1), \tag{5.11.18}$$

where $a$ is a fixed constant. It is seen that $R = 0$ for $f(x) = 1$ and for
$f(x) = x$, but that $R \ne 0$ for $f(x) = x^2$ unless $a^2 = \tfrac13$. Thus we have
always $N = 1$, and also $N > 1$ when $a^2 = \tfrac13$. Here

$$(A,B) = (a,b) = (-1,1)$$

and $w(x) = 1$. The use of (5.11.16) or (5.11.17) gives

$$1!\,G(s) = \int_{-1}^{1}(x - s)_+\,dx - (-a - s)_+ - (a - s)_+ = \int_s^1 (x - s)\,dx - \sum_{x_k \ge s}(x_k - s), \tag{5.11.19}$$

when $|s| \le 1$, so that

$$G(s) = \frac{(1 - s)^2}{2} - \sum_{x_k \ge s}(x_k - s), \tag{5.11.20}$$

where $x_0 = -a$ and $x_1 = +a$. Hence there follows

$$G(s) = \begin{cases}\dfrac{(1-s)^2}{2} - (-a - s) - (a - s) = \dfrac{(1+s)^2}{2} & (-1 \le s \le -a),\\[3mm] \dfrac{(1-s)^2}{2} - (a - s) = \dfrac{s^2 + (1 - 2a)}{2} & (-a \le s \le a),\\[3mm] \dfrac{(1-s)^2}{2} & (a \le s \le 1),\end{cases} \tag{5.11.21}$$

and, with $G(s)$ so defined, the error $R$ in (5.11.18) can be expressed in
the form

$$R = \int_{-1}^{1} G(s) f''(s)\,ds. \tag{5.11.22}$$

We may notice that this function $G(s)$ is made up of the arcs of three
parabolas which join continuously at the transition points, coinciding
with the abscissas employed in (5.11.18). However, the slope $G'(s)$
decreases abruptly by unity as each such point is crossed in the positive
direction. Also, in each subinterval we have $G''(s) = 1$.
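
The representation (5.11.22) can be tested numerically. The sketch below is an illustration under stated assumptions, not part of the text's derivation: it takes $a = 0.3$ and $f(x) = x^4$, evaluates $G(s)$ from (5.11.21), and compares $\int G f''\,ds$ (by a plain composite midpoint rule, chosen here only for simplicity) with the directly computed error. The names `G` and `midpoint` are hypothetical.

```python
def G(s, a):
    """Influence function (5.11.21) for the rule f(a) + f(-a) on (-1, 1), N = 1."""
    if s < -1.0 or s > 1.0:
        return 0.0
    if s <= -a:
        return (1.0 + s) ** 2 / 2.0
    if s <= a:
        return (s * s + 1.0 - 2.0 * a) / 2.0
    return (1.0 - s) ** 2 / 2.0

def midpoint(func, lo, hi, n=20000):
    """Composite midpoint rule with n equal subintervals."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))

a = 0.3
f = lambda x: x ** 4
f2 = lambda x: 12.0 * x ** 2                  # second derivative of f

R_direct = 2.0 / 5.0 - (f(a) + f(-a))         # exact integral minus rule value
R_kernel = midpoint(lambda s: G(s, a) * f2(s), -1.0, 1.0)
G_integral = midpoint(lambda s: G(s, a), -1.0, 1.0)

print(R_direct, R_kernel)                     # the two values should agree closely
print(G_integral, (1.0 - 3.0 * a * a) / 3.0)  # integral of G against (1 - 3a^2)/3
```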
If $a \le \tfrac12$, $G(s)$ vanishes only at the ends $s = \pm 1$, and at $s = 0$ in the
special case $a = \tfrac12$, and is otherwise positive in $(-1,1)$. Hence, in this
case, the second law of the mean may be invoked to permit (5.11.22) to
be written in the form

$$R = f''(\xi)\int_{-1}^{1} G(s)\,ds = \frac{1 - 3a^2}{3} f''(\xi) \qquad (0 \le a \le \tfrac12), \tag{5.11.23}$$

where $|\xi| < 1$. When $a = 1$, there follows merely $G(s) = (s^2 - 1)/2$ for
$-1 < s < 1$. In this case $G(s)$ is negative throughout the interior of the
interval, so that the law of the mean again can be applied, and (5.11.23)
also holds in this case,

$$R = f''(\xi)\int_{-1}^{1}\frac{s^2 - 1}{2}\,ds = -\tfrac23 f''(\xi) \qquad (a = 1). \tag{5.11.24}$$

If $\tfrac12 < a < 1$, $G(s)$ changes sign at $s = \pm\sqrt{2a - 1}$, and (5.11.22) cannot
be transformed in this way. However, in any case it can be deduced
that

$$|R| \le |f''(x)|_{\max}\int_{-1}^{1}|G(s)|\,ds. \tag{5.11.25}$$
In the special case in which $a = \sqrt3/3$ in (5.11.18), $R$ vanishes also
for $f(x) = x^2$ and for $f(x) = x^3$, but does not vanish for $f(x) = x^4$. Hence
the degree of precision is then three, and we may obtain a more useful
formula by taking $N = 3$, in accordance with which

$$24\,G(s) = \begin{cases}(1 + s)^4 & (-1 \le s \le -a),\\ s^4 + 6(1 - 2a)s^2 + (1 - 4a^3) & (-a \le s \le a),\\ (1 - s)^4 & (a \le s \le 1),\end{cases} \tag{5.11.26}$$

where $a = \sqrt3/3$. It is easily verified that $G(s)$ is continuous and that
it vanishes only at the ends of the interval, so that the second law of the
mean may be invoked to give

$$R = f^{\mathrm{iv}}(\xi)\int_{-1}^{1} G(s)\,ds = \tfrac{1}{135} f^{\mathrm{iv}}(\xi),$$

and hence there follows

$$\int_{-1}^{1} f(x)\,dx = f\left(-\frac{\sqrt3}{3}\right) + f\left(\frac{\sqrt3}{3}\right) + \tfrac{1}{135} f^{\mathrm{iv}}(\xi), \tag{5.11.27}$$

where $|\xi| < 1$.†
† This remarkable formula is a member of the class of so-called Gaussian quadrature
formulas, to be considered in §8.5, as well as the class of Chebyshev quadrature
formulas, to be treated in §8.13.

This example may serve to indicate the use of the influence function
in other cases. From the definition (5.11.17), it is easily seen that $G(s)$
and its first $N - 1$ derivatives are continuous at the transition points
and that they all vanish at the end points, $s = A$ and $s = B$, of the interval
of integration in (5.11.15). Further, it is found from (5.11.17) that

$$(-1)^N G^{(N)}(s) = \int_s^b w(x)\,dx - \sum_{x_k \ge s} W_k \tag{5.11.28}$$

and

$$G^{(N+1)}(s) = (-1)^{N+1} w(s) \tag{5.11.29}$$

in each subinterval, with the convention that $w(x)$ is to be taken as
zero when $x$ is outside $(a,b)$ in both (5.11.28) and (5.11.29). Thus,
$(-1)^N G^{(N)}(s)$ increases abruptly by $W_i$ as $s$ increases through the $i$th
abscissa, but is continuous inside each subinterval.

It may be seen that, if $G(s)$ does not change sign in $(A,B)$, the use of
the second law of the mean shows that (5.11.15) is expressible in the form

$$R[f(x)] = K f^{(N+1)}(\xi) \qquad (A < \xi < B), \tag{5.11.30}$$

where $K$ is independent of $f(x)$. In particular, if we take $f(x) = x^{N+1}$,
there follows

$$R[x^{N+1}] = (N + 1)!\,K.$$

Thus $K$ is determined, and, from (5.11.30), we deduce that

$$R[f(x)] = \frac{f^{(N+1)}(\xi)}{(N+1)!}\,R[x^{N+1}] \tag{5.11.31}$$

if $G(s)$ does not change sign in $(A,B)$.
In illustration, we have seen that the $G(s)$ associated with (5.11.18)
does not change sign in the cases when $0 \le a \le \tfrac12$ or when $a = 1$, and
that then $N = 1$. Thus, in place of evaluating the integral involved in
(5.11.23) in those cases, we can use (5.11.31) to obtain the same result
more easily:

$$R[f(x)] = \frac{f''(\xi)}{2!}\left[\int_{-1}^{1} x^2\,dx - a^2 - (-a)^2\right] = \frac{1 - 3a^2}{3} f''(\xi).$$


However, the initial labor of determining G(s) and actually investigat- 
ing whether or not it changes sign in (A,B) may be appreciable when N 
is moderately large. The preceding simple example shows that the 
requirement that the weights W; be positive is not sufficient to guarantee 
that G(s) will be of constant sign. 

A third form of the error term, complementing the alternatives (5.11.6)
and (5.11.15), can be obtained by replacing $f(x)$ by the sum of the polynomial
$y_n(x)$, which agrees with it at the $n + 1$ points $x_0, x_1, \ldots, x_n$
involved in the integration formula, and the appropriate remainder term
(2.6.1), so that we write

$$f(x) = y_n(x) + \pi(x) f[x_0, x_1, \ldots, x_n, x], \tag{5.11.32}$$

where, as before,

$$\pi(x) = (x - x_0)(x - x_1)\cdots(x - x_n). \tag{5.11.33}$$

If we suppose that the degree of precision of (5.11.1) is at least equal
to $n$, as is true for most of the useful formulas, the polynomial $y_n(x)$ is
annihilated by the operator in (5.11.2). Since also the remainder term
in (5.11.32) vanishes when $x = x_i$, for $i = 0, 1, \ldots, n$, there follows
simply

$$R[f(x)] = \int_a^b w(x)\pi(x) f[x_0, x_1, \ldots, x_n, x]\,dx. \tag{5.11.34}$$


In many cases, there exists a function $V(x)$ such that

$$w(x)\pi(x) = \frac{d^r V(x)}{dx^r}, \tag{5.11.35}$$

where $V(x)$ and its first $r - 1$ derivatives vanish for both $x = a$ and
$x = b$, for some positive integer $r$. Under this assumption, the result
of integrating (5.11.34) by parts $r$ times is seen to be

$$R[f(x)] = (-1)^r\int_a^b V(x)\frac{d^r}{dx^r} f[x_0, \ldots, x_n, x]\,dx$$

and, after making use of (2.3.9) and (3.3.14), combined in the form

$$\frac{d^r}{dx^r} f[x_0, \ldots, x_n, x] = \frac{r!}{(n + r + 1)!} f^{(n+r+1)}(\eta), \tag{5.11.36}$$

where $\eta$ is interior to the interval limited by the largest and smallest of
the $n + 2$ arguments on the left, there follows

$$R[f(x)] = \frac{(-1)^r r!}{(n + r + 1)!}\int_a^b V(x) f^{(n+r+1)}(\eta)\,dx. \tag{5.11.37}$$

If also $V(x)$ is of constant sign in $(a,b)$, this result can be further simplified
to the form

$$R[f(x)] = \frac{(-1)^r r!\,f^{(n+r+1)}(\xi)}{(n + r + 1)!}\int_a^b V(x)\,dx, \tag{5.11.38}$$

where $\xi$ lies between the smaller of $a$ and $x_0$ and the larger of $b$ and $x_n$.
In addition, by integrating by parts $r$ times, and again making use of
(5.11.35) and of the assumed properties of $V(x)$, we find that

$$\int_a^b V(x)\,dx = \frac{(-1)^r}{r!}\int_a^b [x^r + u_{r-1}(x)]V^{(r)}(x)\,dx = \frac{(-1)^r}{r!}\int_a^b [x^r + u_{r-1}(x)]w(x)\pi(x)\,dx,$$

where $u_{r-1}(x)$ is an arbitrary polynomial of degree $r - 1$ or less. Hence
(5.11.38) is also expressible in the equivalent form

$$R[f(x)] = \frac{f^{(n+r+1)}(\xi)}{(n + r + 1)!}\int_a^b x^r w(x)\pi(x)\,dx, \tag{5.11.39}$$

where $x^r$ can be replaced by any convenient polynomial of degree $r$ in
which the coefficient of $x^r$ is unity, if so desired.

This result will be of particular usefulness in Chap. 8. In the case of
the formula (5.11.18), it is found that

$$w(x)\pi(x) = x^2 - a^2 = \frac{d}{dx}\left[\tfrac13 x^3 - a^2 x + \left(\tfrac13 - a^2\right)\right],$$

where the constant of integration is determined so that the function in
brackets vanishes when $x = -1$. That function will also vanish when
$x = +1$ if $a^2 = \tfrac13$, in which case there follows further

$$w(x)\pi(x) = \frac{d^2}{dx^2}\left[\tfrac1{12}(1 - x^2)^2\right],$$

so that we may take $V(x) = (1 - x^2)^2/12$ in that case. The use of
(5.11.38) or (5.11.39), with $n = 1$ and $r = 2$, leads again to the result
given in (5.11.27).

It may be noticed that, if (5.11.38) or (5.11.39) is valid, the degree of
precision of the relevant integration formula is $n + r$.
In order to express in a different form the conditions permitting the
use of (5.11.38) or (5.11.39), we may make use of Theorem 12 of §1.7 to
show that, if $V^{(r)}(x) = w(x)\pi(x)$ and if $V, V', \ldots, V^{(r-1)}$ vanish at
$x = a$, there follows

$$V(x) = \frac{1}{(r-1)!}\int_a^x (x - s)^{r-1} w(s)\pi(s)\,ds \tag{5.11.40}$$

and also the requirements that $V, V', \ldots, V^{(r-1)}$ also vanish at $x = b$
take the form

$$\int_a^b (b - s)^k w(s)\pi(s)\,ds = 0 \qquad (k = 0, 1, 2, \ldots, r - 1). \tag{5.11.41}$$

Further, if we assume that the degree of precision of (5.11.1) is $n + r$,
where $r \ge 1$, it follows that the right-hand member of (5.11.34) will
vanish when $f(x)$ is any polynomial of degree $n + r$ or less, or, equivalently,
when the divided difference $f[x_0, x_1, \ldots, x_n, x]$, of order $n + 1$, is
any polynomial of degree $r - 1$ or less. But this situation implies the
truth of (5.11.41).

Hence we may deduce that if the degree of precision of the integration
formula (5.11.1) is $n + r$, where $r \ge 1$, and if the function

$$V(x) = \frac{1}{(r-1)!}\int_a^x (x - s)^{r-1} w(s)\pi(s)\,ds$$

does not change sign in $(a,b)$, then the error $R$ is given by (5.11.38) or
(5.11.39).

5.12. Other Representations of Error Formulas. If the degree of
precision of (5.11.1) is exactly $n$, where $n + 1$ ordinates are used, the
function $V(x)$ defined by (5.11.35) will not vanish at both ends of the
interval $(a,b)$ when $r = 1$, so that (5.11.38) and (5.11.39) then are not
valid. Whereas the use of the $G$ function of the preceding section generally
involves the individual consideration of each of the ranges $(x_i, x_{i+1})$,
and whereas the vanishing of $\pi(x)$ at each abscissa $x_i$ would require the
same subdivision of $(a,b)$ before the second law of the mean could be
used in connection with (5.11.34), it may be possible to define $V$ functions
which are appropriate to subintervals comprising several such ranges, and
so to obtain a more useful form of the remainder with decreased labor.

In illustration, the formula approximating the integral of $f(x)$ over
$(0,3)$ in terms of the three ordinates at $x = 0, 1, 2$, with $w(x) = 1$, would
possess the error term

$$R = \int_0^3 \pi(x) f[0,1,2,x]\,dx, \qquad \pi(x) = x(x - 1)(x - 2), \tag{5.12.1}$$

if its degree of precision were at least two, by (5.11.34). Here we have

$$\pi(x) = x^3 - 3x^2 + 2x = \tfrac14(x^4 - 4x^3 + 4x^2)' = \tfrac14[x^2(x - 2)^2]',$$

so that the function $V(x) = x^2(x - 2)^2/4$ is appropriate for the subinterval
$(0,2)$. In the remaining subinterval $(2,3)$, $\pi(x)$ does not change
sign. Hence we may deduce that

$$\begin{aligned}R &= -\int_0^2 V(x) f[0,1,2,x,x]\,dx + \int_2^3 \pi(x) f[0,1,2,x]\,dx\\ &= -\frac{f^{\mathrm{iv}}(\xi_1)}{4!}\int_0^2 V(x)\,dx + \frac{f'''(\xi_2)}{3!}\int_2^3 \pi(x)\,dx\\ &= -\tfrac1{90} f^{\mathrm{iv}}(\xi_1) + \tfrac38 f'''(\xi_2),\end{aligned} \tag{5.12.2}$$

where both $\xi_1$ and $\xi_2$ lie inside $(0,3)$.
In other cases, the function $Q(x)$ defined by the relations

$$Q'(x) = \frac{w(x)\pi(x)}{x - x_k}, \qquad Q(A_k) = 0 \tag{5.12.3}$$

or, equivalently,

$$Q(x) = \int_{A_k}^{x}\frac{w(t)\pi(t)}{t - x_k}\,dt, \tag{5.12.4}$$

where $x_k$ is one of the abscissas, may have the property that it does not
change sign over the subinterval $(A_k, x_k)$ of the range $(a,b)$, when $A_k$ is
suitably chosen. In view of the identity

$$(x - x_k) f[x_0, \ldots, x_n, x] = f[x_0, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n, x] - f[x_0, \ldots, x_n], \tag{5.12.5}$$

where the second term on the right is independent of $x$, we can write

$$\begin{aligned}\int_{A_k}^{x_k} w(x)\pi(x) f[x_0, \ldots, x_n, x]\,dx &= \int_{A_k}^{x_k} Q'(x)\{f[x_0, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n, x] - f[x_0, \ldots, x_n]\}\,dx\\ &= \Big[Q(x)\{f[x_0, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n, x] - f[x_0, \ldots, x_n]\}\Big]_{A_k}^{x_k}\\ &\quad - \int_{A_k}^{x_k} Q(x) f[x_0, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n, x, x]\,dx,\end{aligned} \tag{5.12.6}$$

after an integration by parts. Now $Q(x)$ vanishes when $x = A_k$, and
its coefficient in the integrated term vanishes when $x = x_k$. Since also
$Q(x)$ is assumed not to change sign in $(A_k, x_k)$, the second law of the
mean is applicable to the second term, and there follows

$$\int_{A_k}^{x_k} w(x)\pi(x) f[x_0, \ldots, x_n, x]\,dx = -\frac{f^{(n+1)}(\xi)}{(n+1)!}\int_{A_k}^{x_k} Q(x)\,dx. \tag{5.12.7}$$

Also, if we notice that $\int Q(x)\,dx = \int Q(x)\,d(x - x_k)$, and integrate by
parts, there follows

$$\int_{A_k}^{x_k} Q(x)\,dx = \Big[(x - x_k)Q(x)\Big]_{A_k}^{x_k} - \int_{A_k}^{x_k}(x - x_k)Q'(x)\,dx = -\int_{A_k}^{x_k} w(x)\pi(x)\,dx,$$

so that (5.12.7) becomes

$$\int_{A_k}^{x_k} w(x)\pi(x) f[x_0, \ldots, x_n, x]\,dx = \frac{f^{(n+1)}(\xi)}{(n+1)!}\int_{A_k}^{x_k} w(x)\pi(x)\,dx. \tag{5.12.8}$$

Thus, in spite of the fact that $\pi(x)$ may change sign in $(A_k, x_k)$, it follows
that the result of formally applying the law of the mean to the left-hand
member of (5.12.8), and then using (5.11.36), with $r = 0$, yields a correct
result when the function $Q(x)$ defined by (5.12.3) or (5.12.4) does not change
sign in $(A_k, x_k)$.

We may notice also that if, instead, $Q(x)$ does not change sign between
$x = A_k$ and $x = B_k$, and if $Q(B_k) = 0$, there follows also

$$\int_{A_k}^{B_k} w(x)\pi(x) f[x_0, \ldots, x_n, x]\,dx = \frac{f^{(n+1)}(\xi)}{(n+1)!}\int_{A_k}^{B_k} w(x)\pi(x)\,dx, \tag{5.12.9}$$

by a slight modification of the same argument.
As a first example, we notice that the error term relevant to the
Newton-Cotes four-point formula of closed type with $h = 1$,

$$\int_0^3 f(x)\,dx = \tfrac38[f(0) + 3f(1) + 3f(2) + f(3)] + R, \tag{5.12.10}$$

is of the form

$$R = \int_0^3 \pi(x) f[0,1,2,3,x]\,dx, \qquad \pi(x) = x(x - 1)(x - 2)(x - 3). \tag{5.12.11}$$

Here the use of the function $V(x)$ is found to be inappropriate. However,
we find that

$$\frac{\pi(x)}{x - 3} = x(x - 1)(x - 2) = \tfrac14[x^2(x - 2)^2]',$$

so that the function $Q(x) = x^2(x - 2)^2/4$, corresponding to the choice
$A_k = 0$ in (5.12.3), is nonnegative for $0 \le x \le 3$ (as well as for all
other real values of $x$). Hence (5.12.8) applies, with $A_k = 0 = a$ and
$x_k = 3 = b$, and it yields

$$R = \frac{f^{\mathrm{iv}}(\xi)}{4!}\int_0^3 \pi(x)\,dx = -\tfrac{3}{80} f^{\mathrm{iv}}(\xi), \tag{5.12.12}$$

in accordance with (3.5.12).


As a second example, we consider the Newton-Cotes two-point formula
of open type with $h = 1$,

$$\int_0^3 f(x)\,dx = \tfrac32[f(1) + f(2)] + R, \tag{5.12.13}$$

for which we may write

$$R = \int_0^3 \pi(x) f[1,2,x]\,dx, \qquad \pi(x) = (x - 1)(x - 2). \tag{5.12.14}$$

Again the use of $V(x)$ is inappropriate. However, we have

$$\frac{\pi(x)}{x - 2} = x - 1 = \tfrac12[x(x - 2)]',$$

corresponding to the choice $x_k = 2$, $A_k = 0$ in (5.12.3), so that (5.12.8)
applies over $(0,2)$. Since $\pi(x)$ does not change sign in $(2,3)$, we may write

$$R = \frac{f''(\xi_1)}{2!}\int_0^2 \pi(x)\,dx + \frac{f''(\xi_2)}{2!}\int_2^3 \pi(x)\,dx = \tfrac13 f''(\xi_1) + \tfrac5{12} f''(\xi_2)$$

and, since the numerical coefficients are of the same sign, we may combine
the terms in the form

$$R = \tfrac34 f''(\xi), \tag{5.12.15}$$

in accordance with (3.5.18).
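
Once the sign condition on $Q(x)$ has been verified by hand, the numerical constants produced by (5.12.8) reduce to exact integrals of $\pi(x)$. The sketch below, with $w(x) = 1$ assumed and with hypothetical helper names, evaluates those integrals in rational arithmetic for the two examples above; it does not itself check the sign of $Q(x)$.

```python
from fractions import Fraction
from math import factorial

def poly_from_roots(roots):
    """Coefficients (lowest power first) of the product of (x - root)."""
    coeffs = [Fraction(1)]
    for root in roots:
        shifted = [Fraction(0)] + coeffs                     # x * p(x)
        scaled = [root * c for c in coeffs] + [Fraction(0)]  # root * p(x)
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs

def integrate(coeffs, lo, hi):
    """Exact integral over (lo, hi) of the polynomial with the given coefficients."""
    return sum(c * (Fraction(hi) ** (i + 1) - Fraction(lo) ** (i + 1)) / (i + 1)
               for i, c in enumerate(coeffs))

def mean_value_constant(roots, lo, hi):
    """Integral of pi(x) over (lo, hi) divided by (n+1)!, as in (5.12.8) with w = 1."""
    return integrate(poly_from_roots(roots), lo, hi) / factorial(len(roots))

# Closed four-point rule on (0, 3): coefficient of f^iv in (5.12.12).
K_closed = mean_value_constant([0, 1, 2, 3], 0, 3)

# Open two-point rule on (0, 3): contributions from (0, 2) and (2, 3), as in (5.12.15).
K_open = mean_value_constant([1, 2], 0, 2) + mean_value_constant([1, 2], 2, 3)

# Cross-check: the error of (5.12.10) applied to x^4 should equal 4! * K_closed.
residual_x4 = integrate([0, 0, 0, 0, 1], 0, 3) - Fraction(3, 8) * (0 + 3 * 1 + 3 * 16 + 81)

print(K_closed, K_open, residual_x4)
```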

The $V$ and $Q$ methods, when applicable, are usually considerably more
convenient than the more general $G$ method of §5.11, which generally
entails the determination and analysis of $n$ or more distinct functions
[each a polynomial of degree $N + 1$ if $w(x) = 1$] when $n + 1$ ordinates
are involved. However, it must be noticed that the $V$ and $Q$ methods
are not applicable in those cases when the degree of precision of the integration
formula is less than $n$.

Formulas which involve values of certain derivatives of f(x) as well as 
the value of f(x) itself, at certain points, may be considered as limits of 
formulas in which r + 1 abscissas coalesce into a single abscissa, corre-
sponding to which the values of $f, f', \ldots,$ and $f^{(r)}$ are used. Thus, for
example, if the coefficients $W_0$, $W_1$, $W_2$, and $C_1$ are determined in such a
way that the formula


$$\int_{-1}^{1} w(x) f(x)\,dx \approx W_0 f(-1) + W_1 f(0) + W_2 f(1) + C_1 f'(0) \tag{5.12.16}$$


is exact for $f(x) = 1, x, x^2,$ and $x^3$, and so for any polynomial of degree
three or less, the error term will be of the form


$$R = \int_{-1}^{1} w(x)(x + 1)x^2(x - 1)\,f[-1,0,0,1,x]\,dx. \tag{5.12.17}$$


Here the second law of the mean applies directly and gives the simpler
result

$$R = \frac{f^{iv}(\xi)}{4!}\int_{-1}^{1} w(x)\,x^2(x^2 - 1)\,dx \tag{5.12.18}$$

which yields

$$R = -\tfrac{1}{90}\,f^{iv}(\xi) \tag{5.12.19}$$

in the special case w(x) = 1.
However, for the formula


$$\int_{-1}^{1} f(x)\,dx \approx W_0 f(-1) + W_1 f(0) + W_2 f(1) + C_2 f'(1), \tag{5.12.20}$$


with the weighting coefficients determined by the same requirements, 
there follows 


$$R = \int_{-1}^{1} (x + 1)x(x - 1)^2\,f[-1,0,1,1,x]\,dx, \tag{5.12.21}$$


† The same methods apply, in particular, to all Newton-Cotes formulas which
employ an even number of ordinates, whereas the V method succeeds when an odd
number of ordinates is used. The methods are based on analyses given by Steffensen
in those cases.

and, since here $\pi(x)$ changes sign at x = 0, another approach is needed.
Since also the function $\int_{-1}^{x} \pi(t)\,dt$ does not vanish when x = 1, the V
method fails. On the other hand, since

$$\frac{\pi(x)}{x - 1} = x(x^2 - 1) = \tfrac{1}{4}\,[(x^2 - 1)^2]',$$


the function Q(x) = (x? — 1)?/4 is appropriate with A; = 
and Eq. (5.12.8) gives 





$$R = \frac{f^{iv}(\xi)}{4!}\int_{-1}^{1} \pi(x)\,dx = -\tfrac{1}{90}\,f^{iv}(\xi). \tag{5.12.22}$$


The fact that (5.12.19) and (5.12.22) are both identical with the error
term relevant to Simpson's rule (for which $W_0 = W_2 = \tfrac{1}{3}$, $W_1 = \tfrac{4}{3}$, and
$C_1 = 0$ or $C_2 = 0$) suggests that both (5.12.16) and (5.12.20) will reduce
to Simpson's rule in the case w(x) = 1, when the weights are determined
in such a way that the degree of precision is at least three, that is, that
the weights $C_1$ and $C_2$ will be required to vanish. A direct derivation
will confirm this suspicion.
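
One way of carrying out such a direct derivation numerically is to impose exactness for $1, x, x^2,$ and $x^3$ and to solve the resulting four linear equations for the weights. The sketch below does this in exact arithmetic for w(x) = 1; the functions `solve`, `moment`, `derivative_at`, and `weights` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def moment(k):
    """Integral of x**k over (-1, 1) with w(x) = 1."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

def derivative_at(k, x):
    """Derivative of x**k evaluated at x."""
    return k * Fraction(x) ** (k - 1) if k > 0 else Fraction(0)

def weights(derivative_node):
    """Weights [W0, W1, W2, C] making the formula exact for 1, x, x**2, x**3."""
    matrix = [[Fraction(-1) ** k, Fraction(0) ** k, Fraction(1) ** k,
               derivative_at(k, derivative_node)] for k in range(4)]
    return solve(matrix, [moment(k) for k in range(4)])

print(weights(0))   # formula (5.12.16): derivative term taken at x = 0
print(weights(1))   # formula (5.12.20): derivative term taken at x = 1
```

In both cases the derivative weight comes out as zero and the remaining weights are those of Simpson's rule over (−1,1).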

The direct derivation of the error formula relevant to Simpson’s rule 
itself, over (—1,1), is effected most easily by the V method, since here 


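$$R = \int_{-1}^{1} \pi(x)\,f[-1,0,1,x]\,dx$$


where $\pi(x) = x(x^2 - 1) = \tfrac{1}{4}\,[(x^2 - 1)^2]' = V'(x)$.


Thus there follows


$$R = -\tfrac{1}{4}\int_{-1}^{1} (x^2 - 1)^2\,f[-1,0,1,x,x]\,dx = -\frac{f^{iv}(\xi)}{4 \cdot 4!}\int_{-1}^{1} (x^2 - 1)^2\,dx = -\tfrac{1}{90}\,f^{iv}(\xi).$$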

5.13. Supplementary References. The use of symbolic methods essen-
tially dates from Boole [50]. See also Steffensen [18], Michel [149], and
Bickley [39]. For Comrie's method of "bridging differences" in sub-
tabulation, see Hartree [2]. The polynomials and numbers of Bernoulli,
Euler, and Stirling are treated in Fort [82], where collateral references
are given. Hardy [102] treats summation of divergent series; for recent
accounts of methods for accelerating the convergence of series, see Sz
[219], Cherry [55], and Rosser [198]. Many series whose sums are known
are listed by Jolley [125]. General expressions for remainder formulas are
given by Peano [177, 178], Rémés [193], Sard [205], and Milne [1
See also Birkhoff [43], von Mises [157], Radon [188], Daniell [67], and
Householder [8].





PROBLEMS
Section 5.2 
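1. Obtain the formal relations

$$\mu = \tfrac{1}{2}\,(E^{1/2} + E^{-1/2}) = \frac{2 + \Delta}{2\sqrt{1 + \Delta}} = \frac{2 - \nabla}{2\sqrt{1 - \nabla}} = \sqrt{1 + \tfrac{1}{4}\delta^2}$$

and construct a table expressing each of the operators $E$, $\Delta$, $\nabla$, $\delta$, and $\mu$ similarly in
terms of each of the operators $E$, $\Delta$, $\nabla$, and $\delta$.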
2. Establish the relations 
A=Ev, V=E-14, 


EA = Ely = 4, 
nb = a(a + VY), 


E+? =p + 36, Bw 
Section 5.3 


3. Express each of the operators $E$, $\Delta$, $\nabla$, $\delta$, $\mu$, and $\mu\delta$ in terms of $hD$.

4. Express the operator $h^{-1}J$ in terms of $E$, $\Delta$, $\nabla$, $\delta$, and $hD$.

5. Show that the interpolation formulas of Stirling, Bessel, and Everett can be 
obtained operationally by rewriting the relation E* = e*D in the forms 


E* = cosh shD + ae aL 
cosh ghD °’ 
h thD 
Rw Be | OS : 
(a tnp “ + sinh thD ) Et, 
and + = Sinh sD | sinh (1 — s)AD 
. # sinh hD Bit sinh kD 


respectively, and expanding the right-hand members in inh 4 

; powers of 6 = 2 sinh shD b 
\ining the results of Probs. 31 and 32 of Chap. 4 with a replaced by AD and x be s or : 
Why would the corresponding expansion of the simpler relation 


E* = e*D = cosh shD + sinh shD 
he of limited usefulness? 


6. From the following rounded values of the function f(x) = sin x, calculate approxi-
mate values of f′(x) and f″(x) at each tabular point and compare the results with the
rounded true values:

   x      0.5      0.7      0.9      1.1      1.3      1.5      1.7
  f(x)  0.47943  0.64422  0.78333  0.89121  0.96356  0.99749  0.99166


Section 5.4
7. Using the data of Prob. 6, calculate the approximate value of $\int_{0.5}^{x} f(x)\,dx$ for
x = 0.7, 0.9, and 1.1, and the approximate value of $\int_{x}^{1.7} f(x)\,dx$ for x = 1.1, 1.3, and
1.5. From these results determine approximate values of the integral taken over each
tabular interval.
Section 5.5
8. Using the data of Prob. 6, calculate approximate values of the quantities 


ik . te f(t) dt dz, I ii f(t) dt de? 


if i = (1). = 1, calculate approximate 
9. If F(x) = log tan x and if F(1) =0 and F’( . 
values of F(x) for + = 1.00(0.02) (1.10), using only tabulated five-place values of 


log tan x [= (log 10) (logis tan z)] forz 21. 
“10. Show that, if the operator 6 is defined by the relation 


ie ie p(t) dtdz = 6p,, 
ik 


tk 


then . 





2 
—1-—rhD ( +a)yr —1—r log (1 +4) [ A |; 
= E ue r ee val oc wea 
and determine the first three coefficients in the expansion of the operator 6 in powers of 


A, as functions of r. 
"44. Show that the right-hand member of the result of operating on the equal 


is i t of V3 if and only if a1 = —1, 
embers of (5.5.12) by 1 + aiV + a:V? is independen j 
tat the result is equivalent to (5.5.11) if also a2 = 0, and that a particularly con- 


venient choice is that for which a2 = %, leading to the formula 


a” 
Prat — Pe — Peo + Pea = 3h(1 —V + egV? + OV? + 7a90V! + 7z0Vo + Pye 


) 


(This formula is used in §6.12.) 


Section 5.6 
12. Using the data of Prob. 6, calculate approximate values of 


| eae fla) 43 


1.1—0.2m 


for m = 1, 2, and 3. ; 
18. Derive the operational relation 


p(x) dx = hu 


ato-bh tanh $hD 
ie ghD 


and obtain the expansion in powers of 6 in the form 
i eth on) de = hu — vend? + teed — woreond® + - * “1D, 
xo 


14. Use the result of Prob. 13 and the data of Prob. 6 to calculate approxima! 


+0.2 
values of the integral f f(a) dz for x = 0.9 and 1.1. 
Section 5.7 
15. Subtabulate the data of Prob. 6 for z = 0.50(0.02)0.70 ands = 1.50(0.02) 
16. If 6’ represents the central-difference operator relative to the spacing h' = 
show that 
8’ sinh gehD 
5 ~ sinh $hD 
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and obtain the expansion of the right-hand member in powers of 6 = 2 sinh shD (see 
Prob. 32 of Chap. 4, with zx = p/2,a = hD, and 6 = 5), thus deducing the relation 


»-9,/$ Lae (5) Joa = (5) - ---}: 
. - [5 Bie Na ge eae ae 
Show also that . 

ps ~— sinh hD 


and obtain the expansion of the right-hand member in powers of 6 (see Prob. 31 of 
Chap. 4, with z = p, a = hD, and B = 8), thus deducing the relation 


(us)! = pd! = o| ua = ui (12 — p2)ud3 + 3 (1? — p*)(2? — p?)ws5 — -- - | 


17. In the case of subtabulation to tenths (p = To); deduce from the results of 
Prob. 16 the formulas 


(u5)’ = O.1u6 — 0.01656? + 0.00329175u55 — - - + 
6/2 = 0.015? — 0.00082554 ++ ---, 

(u5)’ = 0.001n5* — 0.0002475u55 + ---, 
64 = 0.000154 — -- - 


? 


(ud5)’ = 0.00001p85 — - - - 


? 


when differences of order greater than five are neglected, and use these formulas to sub- 
tabulate the data of Prob. 6 for x = 0.90(0.02)1.10. 

18. Suppose that mean values of f(x) are known over each of the ranges (a, — h/2, 
w +h/2) (k =0, 1, 2, ...), where ze41 — 2 =h, and that approximate mean 


values over ranges of length 2ph, again centered about the points z;, are required. 
With the notations 


1 fzeth/2 


, 1 xrktph 
me = 5 Blan f(x) dz, m, = 


2ph tk —ph f(z) dx, 


derive the operational relation 


' 1 sinh phD 


+ 20 sinh 3hD ”” 


and deduce the formula 
Ye _—1 a _ yas tt ono ayes ws 
m™m, 1 3! & p )6 + 5! (< Pp VG p 6 Mr. 


(Nee Prob. 32 of Chap. 4, with z = p,a = hD, and 8 = 6.) In particular, deduce the 


formula 
12/6\?2, 12+32 /6\4 
n=([1-3(3) ar rae (5) ae ++] ma, 


Section 5.8
19. The Bernoulli polynomial $B_k(x)$, of kth degree, is defined as the coefficient of
$u^k/k!$ in the expansion

$$\frac{u\,e^{xu}}{e^u - 1} = \sum_{k=0}^{\infty} B_k(x)\,\frac{u^k}{k!}.$$

(a) By differentiating the equal members of this relation, deduce the differential
recurrence formula

$$B'_k(x) = k\,B_{k-1}(x) \qquad (k = 1, 2, \ldots),$$

and show also that $B_0(x) = 1$.
(b) By making use of the identity

$$\frac{(-u)\,e^{-xu}}{e^{-u} - 1} = \frac{u\,e^{(1-x)u}}{e^u - 1},$$

prove that

$$B_k(1 - x) = (-1)^k B_k(x).$$

Also, by integrating the equal members of the defining relation over (0,1), deduce that

$$\int_0^1 B_k(x)\,dx = 0 \qquad (k > 0)$$

and use this result, together with the recurrence formula of (a), to show that

$$B_0(x) = 1, \quad B_1(x) = x - \tfrac{1}{2}, \quad B_2(x) = x^2 - x + \tfrac{1}{6}, \quad B_3(x) = x^3 - \tfrac{3}{2}x^2 + \tfrac{1}{2}x,$$

and so forth.

(c) In accordance with (5.8.7), the kth Bernoulli number $B_k$ is defined by the relation
$B_k = B_k(0)$. Show that

$$\frac{u}{e^u - 1} + \frac{u}{2}$$

is an even function of u, and hence deduce that $B_1 = -\tfrac{1}{2}$ and that $B_{2m+1} = 0$ when
$m \ge 1$.

(d) Use the identity

$$\frac{u\,e^{u/2}}{e^u - 1} = 2\,\frac{u/2}{e^{u/2} - 1} - \frac{u}{e^u - 1}$$

to deduce that

$$B_k(\tfrac{1}{2}) = (2^{1-k} - 1)\,B_k.$$


20. Use appropriate results of Prob. 19 to show that $B_{2m+1}(x)$ vanishes when x = 0,
½, and 1. Show also that, if it vanishes at any point inside (0,1) in addition to x = ½,
then it must vanish at at least two such points. Then deduce that this situation is
impossible by using Rolle's theorem to show that its existence would imply that
$B'_{2m+1}(x) = (2m + 1)B_{2m}(x)$ vanishes at least four times inside (0,1), that $B_{2m-1}(x)$
vanishes at at least two points inside (0,1), in addition to x = ½, and hence that
$B_{2m-3}(x), \ldots, B_3(x)$ have the same property, thus establishing a contradiction since
$B_3(x) = x(x - \tfrac{1}{2})(x - 1)$. Show further that the function $B_{2m+2}(x) - B_{2m+2}$
vanishes at the ends of the interval (0,1), and that its vanishing anywhere
inside (0,1) would contradict the preceding result. Hence deduce that the function
$B_{2m+2}(x) - B_{2m+2}$ vanishes at x = 0 and at x = 1, is of constant sign in
(0,1), and takes on its extreme value in that interval at x = ½.

21. Use successive integrations by parts to show that 


iF [Bamsa(®) — Bamy2|F2"*?)(s) ds = [{Bam42(8) — Bamys)FO"*” (8) 


— Biya (8)FO™ (8) + + + +b Bia (s)F'(8) = Bie (eFC) 
+ ff} Bix? ore) 
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Then by using results of Prob. 19, deduce the formula 


Br 
(2i)! 





1 1 WS 7 : 
AF) + FO] = a F(s) ds + > [FO-v (1) — FO“ (0)] + E, 
i=l 


where 


eal ea if [Bom42(8) — Bams2]FO"*?)(s) ds. 


22. By summing the results of increasing the argument of F successively by 0, 1, 
2, ..., andr — 1, in Prob. 21, obtain the formula 


Bai 
(22)! 





y F(k) = [ro ds + 30F0) + FW) + > [FaHn(r) — F@-0(0)] + En(r), 
k=0 t=1 


where 


r—1 
1 
Em(t) = = Ge gy fy Bamsal6) — Bansal 2, Femen(s +k) | ds 


23. Show that the error term in Prob. 22 can be written in the form 


F@m+2) 1 Bom 
—r oma hh [Bom+2(s) — Bomy2] ds = r Om ED! FQm+2)(£) 


E,(r) = 
for some £ such that 0 < & < 1, if F°+ is continuous in that interval, and also that, 


if F2™+2)(s) does not change sign for 0 < s <r, the error term can be expressed in the 
form 


r—1 
En(r) = — Bowral) Sees 2. [Fem+n(k + 1) — F@m+0(k)] 
=e Bono oe [Fem+n (7) — FOm+)(0)], 


for some 7 such that 0 <7 <1. Further, use the results of Probs. 20 and 19(d) 
(o show that this term is numerically smaller than twice the first term neglected 
in the expansion of Prob. 22 and is of the same sign. [Notice that this expansion 
is reduced to that of (5.8.13) if F(s) is identified with f(zo + hs), with the substitution 
fo ths = 2] 

24. Show that, if the first N — 1 terms of the series 


f253 1 ; 
Seltgtat-:- +at---: =F = 1.644934 


are summed directly, and if the Euler-Maclaurin sum formula is used to approximate 
the remainder, there follows 


u 1 1 
We [l+i+p+s-- +qtpl 


1 1 1 1 1 Bom 
+[y tans tam ~ som tam ~ °° tytn] + Bac. 


Then determine N and m in such a way that the number of terms to be retained is 
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minimized, assuming successively that approximations which round correctly to 5, 
10, and 20 decimal places are required. 


. 
25. Suppose that neither y Fw nor is " F(s) ds necessarily converges as r— ©, 
0 


but that their difference tends to a limit C, so that 


C = lim [y F(k) — [ire as |, 
ro? k=0 


and that F(s) and all its derivatives tend to zero as s—> ©. Show that the Euler- 
Maclaurin expansion of Prob. 22 then can be written in the form 


’, F(k) = i "F(s)ds + C + 3F(r) + y oi Fenn () + Bn(r), 
k=0 





i=l 
where ; 
E,(r) = En(r) — Em(©) 


=o f BicssGV = Buccs [ >, Fanta (gs + i) | ds, 
(2m ! Jo ot 


and also obtain results analogous to those of Prob. 23 in this case. Show further t 


m 
CO = 1F(0) — oi FO-D(0) + En(~), 
p= 1 


F 
\ -B a 
where Bq (oo) = Barssta) — Semis pantn(o) <9 <1), 


if F@™+2)(s) does not change sign for0 <s < ©. ; ; 
26. Use the result of Prob. 25 to deduce the asymptotic expansion 





OT LE 1 Na hens Bom 
Ldig tp ed EE doen + Ook ge T2n2 @mynim + 
where , C= lim (> k — log n), 

ee ped 


assuming the existence of this limit. Also show that 


1 1 1 Bam 


C=5ti53-p0t°*’ + 2m, +2 
Bam n) — Bom 
where E, = -—eteS“Ss os ae ag (9<n <1), 


and that Em is of the same sign as the first neglected term and is less than twice 
large. Finally, determine the best approximation to C obtainable from this expai 
and determine C to five places by equating the two members of the former expai 
when n = 10. (The constant C involved here is known as Buler’s constant and 
known to round to 0,5772156649,) 
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27. Use an appropriate modification of the result of Prob. 25 to deduce the asymp- 
totic expansion 


log n! =log1+log2+--- +logn 
1 1 Bom 
=(n +9) logn+K—n+ 95, - 399 + °° + OmGm — Da 
foie, 
where K = lim flog n! — (n +4) logn 4+ nl, 
n> © 


assuming the existence of this limit, and show that 


1 1 Bom 


K =1-%5 +360 ~ °'° ~ 2m@m—1) 


+ En, 

where E,, is of the same sign as the first neglected term and is less than twice as large. 
Also, calculate an approximate value of K from this expansion, and determine K to 
five places by setting n = 10 in the former one and using the fact that log 10! = 
15.104412. The true value of K is known to be ¥ log 27 = 0.91894. Assuming this 
fact, deduce Stirling’s asymptotic formula for the factorial, in the form 


orn non 1 1 _— 
n! = V/2en ne (14a + ose ve 
Section 5.9 
28. Deduce the Gauss summation formula by writing 6%, = 6°~'p, — 6~1po in 
the formula of Prob. 13 and summing the resultant equal members for k = 0, 1, 


whey ta ds 
29. Show that 


[2 p(@) ae — pe + ps + +p) = PE gl — BVT — 1 BMY 


iad, Seles 
aa a | 





and, by expressing the operator affecting po in terms of A and that affecting p, in terms 
of V, deduce the Gregory summation formula in the operational form 








1 r A) —1 -v) -1 
h is p(a) dx = (po tpit +++ + prit pr) — of d. Po — aoeee Pry 
where o(u) = imate = > ck, 

k=0 


with the notation of (5.4.4). 
80. Use the data given in Prob. 27 of Chap. 3 to obtain approximate values of the 


integral 
Vi [pow 
r JO 


hy means of the Euler-Maclaurin, Gregory, and Gauss formulas. 
81. Apply both the Gregory and Gauss formulas to the approximate summation 
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of the series in the form 


* G(s) = —7e(Q1 — Isl)*(1 + 3is|) —(Is] S DL 
y k and show that 
Qk +13 1 
ei [2 GF) as = —eor(® (le <0. 
to five places, after summing an appropriate number of terms in advance. 
32. Use each of the formulas (5.9.9) to (5.9.11) to sum the series 


y (=) 
4 
k=1 


to five places, after summing an appropriate number of terms in advance. 
33. Determine the Euler sum of each of the following divergent series: 


Also, by writing = (f — ao — h)/h and F(x) = f(t), deduce Simpson’s rule in the 
form (3.5.11). 
39. Apply integration by parts to the result of Prob. 38, to show that the error in 


Simpson’s rule, as applied to F(z) over the interval (—1,1), can be expressed in the 
alternative forms 





R 


—1 a (1 — |s|)3(1 + 3]s|)Fiv(s) ds 


1 
—3 [9 = lobF"G) ds 


1—-14+1—-14+---4+(-)D*4+--:- 1 
ee ent = [', a = [sha = 3ispr'G) ds 
()1-244-84+--- + (-1rt--- 


and deduce that, when the rule is applied to f(z) over an interval (xo, zo + 2h), there 


Also verify that the three series can be obtained formally by setting x = 1 in the follows 


power-series expansions of (1 + )~!, (1 + x)~*, and (1 + 2z)7', respectively, and 
that the Euler sum in each case is the value taken on by the generating function when 


5 4 3 
isk, iRiseM, iis om, 
a=1. 


90 36 


é where M;, is the maximum value of |f (x)| in (xo, xo + 2h), under the assumption that 
Section 5.10 {‘(x) exists and is integrable over that interval. 

40. Determine Wo, W:, and W2, as functions of a, in such a way that the error term 
in the formula 


34. Calculate the approximate value of 


L7 
sin x dx 
0.5 


from the data of Prob. 6, using Weddle’s rule and Hardy’s rule. 
35. Calculate five approximate values of 


if 6 dx 
o 1 +2? 
using the Newton-Cotes seven-point formula, the formulas of Weddle and Hardy, an 


the parabolic and trapezoidal rules, with h = 1, and compare the results with the ti 
value tan7! 6 = 1.406. 


[2 P@ ae = WoF(-a) + WFO) + WFQ) +R <a 81) 


vanishes when F(z) is an arbitrary polynomial of degree three or less, showing that 
the resultant formula is of the form 


[2 P@ dz = F(a) + 2G! - 1)FO) + F@)] +R 


and that its degree of precision is three unlessa = 4/3, and is five in that case. Also, 


show that the influence function corresponding to N = 3 is given by 
Section 5.11 


F ‘ : Lia 4 1 3 < 
36. Show that (z — s)" is a continuous function of x and s if n > 0, and that G(s) = ga(l — |s|)* — T8a2 (@ — |s|) (Is| S$ @), 


1 
1 — |s|)4 < = 1). 
i Si ie (cz — g)ntt b ( i 2 shai jou aa( |) (a Ss \s| ) 
a Tt carrer er il n A 1 ag — Bly Re 8) ae (Compare Prob. 45.) 
41. Show that the function G(s) obtained in Prob. 40 does not change sign in (—1,1) 


37. Derive (5.11.17) from (5.11.16) and, under the assumption that the degree when a = nD and deduce the formula 


precision of (5.11.1) is at least N, show also that G(s) vanishes when s is outside (A,B) 
where A and B are the smallest and largest of xo, 1, . . - » Zn, @, and b. 


1 
F(a) dx = §[2F(—y) — F(0) + 2F(% 7 oFiv 
38. Obtain the influence function G(s) for which i -1 (2) dz = gl2K(—y) (0) + 2F(y)] + rao" (2), 


where |g| <1, Also transform this result to the Newton-Cotes three-point formula 


[LP de = §UP(-1) + APO) + FA) + [arm ds, 4.10), of open. type, 
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42. Show that the degree of precision of the formula where V, V’, and V” vanish for z= +1,show that then V(z) is nonpositive in (—1,1), 


and deduce the formula 





1 
[2 F@ ae = delta) + 16F) + 7F(—1)] ~ Yel") — F-D +B ‘ ; 
[2 P@ az = Hor(— VB + 8F 0) + 5FVB1 + rotsoF"® — (lel <0. 
is five, obtain the influence function relative to N = 5 in the form 

Section 5.12 
G(s) = xe00(1 — Is|)4(1 + 4s] + 5s?), 47. By specializing the Newton-Cotes four-point formula of open type to the 

and deduce that : interval (—2,3) with h = 1, in the form 
R = gras (é) (lé]| < 1). 4 
F(a) de = 2[11F(— : 2 

Also, generalize this result by writing x = (t — xo — h)/h and F(z) = f(é). ie (2) galllF(—1) + FO) + FA) + MFO) + i= m(x)F[—1,0,1,2,2] da, 


43. Show that the degree of precision of the formula 
& p where r(x) = x(x? — 1)(a — 2), and considering the function 


a@ = f°, 75,4, 


F(1) — 2F(0) + F(-1) = YglF’"() + 10F"0) + F'(-DI +B 





is five, and that R can be expressed in the form 
show that the error can be expressed in the form 


Fiv(&,) [2 x(a) de 4 PX) 


R= shy [7 = [lst ~ 6s] — 2G) de = —stoF®) (el <0. 
3 
4! 2 Al 2 a(x) dx = Pri (£), 


E= 
44. Assuming that zo S x S 21, obtain g(z,s) such that 
fle) —F(er) ick where &, 2, and é are inside the interval (—2,3). 
f(a) = f(x0) + (@ — 20) cy ae + [ : g(x,s)f’’(s) ds (%1 — Zo = h) 48. Determine Wi, We, and W; such that the formula 


2 
ee ee fy 2F@) de = WFO) + Wak) + WFQ) +R 


hg(2,8) = —(s — £o)(x1 — 2) (co Ss $2), ‘ 
GX, —(x — 20) (a1 — 8) (« Ss 3%), possesses a degree of precision of at least two, and show that the resultant formula 
tnkes the form 
and deduce the more familiar form of the error term, : 

F(x) dx = $[2F(1) + F(2)|— a5F” 0 se 
R = F(a — 20)(x — a1)f""(E) (to < — < 21). , eF"(8) LESH 







‘ 49. Deri 
45. Show that the error term relevant to the formula of Prob. 40 can be written erere shetormule 


the form 





1 F(z) T wr o. 
is waa = gF(-) + 2F(0) + F(1)] — To9 FP" () (lt} <1). 


1 
Re is V'(a)F[—a,0,a,2] dz, 
60. Show that the error R in Prob. 42 can be written in the form 


where V(x) = 4[(x? — a)? — (1 — a?)?], and deduce that 1 
R= / x2(1 — 2)?F[—1,—1,0,0,1,1,z] dz, 
3 — 5a? |, =i 

R=—z_ Fiv(é) (lé| < 1) 


180 wad that this form leads again to the result 


R=a7ssF"(t) — (|é| < 1). 


61. Show that the error R relevant to the Newton-Cotes five-point formula of 
vloned type, as applied to F(x) over (—2,2), can be expressed in the form 


when 0 < a? $+ or a?=1. Also show that this result reduces to the results 
Probs. 38 and 41 when a = 1 and 3, respectively, and, by determining « such that 
weighting coefficients are equal, deduce the additional formula 


[i p@a =3[F(-Y) +o +P (2) ] + store, 


46. Determine a such that the error term relevant to the formula of Prob. 40 
be written in the form 


2 
se / © V@F(-2,—1,0,1,22] da, 
where Viz) = / “Me — 1 — 4) at. 


Mhow also that V(x) is an even function, so that V(—x) = V(x), and that V(2) = 0. 


1 
2s “e as 
R if 18 V'''(@)F[ —a,0,e,2] da, Nhow further that V(«) increases to a positive maximum value as x increases from —2 
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to —1, that it then decreases steadily as x increases from —1 to 0, and that 


-1 
Vo) = V(-1) + fo ue —1@—4)d =V(-) -{5 gt 
38+7 
-(.-322) 0-0 


Hence deduce that V(0) is positive, that V(x) does not change sign in (—2,2), and 
therefore that 


[e(2 — 1)(2 — 4)] at 





(-2<7<—1). 





B= [2 tat — Det — 4) de = —ytsFK© 


2). 
= (él < 2) 


(A similar analysis, due to Steffensen [18], applies to all Newton-Cotes formulas, of 
closed type, employing an odd number of ordinates.) 

52. For the Newton-Cotes six-point formula of closed type, as applied to F(x) over 
(—2,3), show that the function V(x) of Prob. 51 serves as an appropriate Q function 
over (—2,2), so that the error R can be expressed in the form 


vi ‘vi 3 
ae Prey * (at — 1) (2? — 4)(e — 3) de + G2 [* acer — 1) (@? — 4)(e — 3) de 
= —gigFi(ts) — ohbhoF) = —raabeF() — (-2 < E <3). 


Supplementary Problems 


53. By a double application of Simpson’s rule, derive the formula 


i ‘ J ” f(a,y) dx dy = a [(fo.o + for + fao + fe) + 4(for + fro + fra + four) 
pet + 16f1.1] + B, 
where at, = 20 + rh, ys = yo + 8k, and fre = f(XrYs)s and show that 


= hk [ *f (E1,m1) + kA oem), 


B= - ax! ay 





where £1, £2 lie in (0,22) and 71, 72 in (yo,y2). [More elaborate formulas for two-way 
integration over a rectangle (‘‘cubature formulas”) are obtainable by double applic 
tion of other one-dimensional integration formulas.] 

54. By applying the formula of Prob. 53 to subrectangles, and adding the results, 
derive the two-dimensional generalization of the parabolic rule in the form 


is [Pte dx dy = AE (fo. + 4fio + 2feo + > + > +fm.0) 
‘J xo yo 


+ (for + Afar + Sfaa to tft) + 2a + 4a + fea + + Sma) 
Foa : : +e: rf (fon + 4fisn + fon torres + fmn)] + B, 


hk of (EA a*f (Es,M2) 
me [ ma Put) + ake ar } 





where E= ‘ 
when m and n are even integers. 

55. Let E,, E,, Az, Ay, and so forth, designate operators which affect only 
variable indicated by the subscript, with uniform spacings h and k implied in the 
and y directions, respectively, so that, for example, 63 fo,0 ™ 1,0 — 2fo0 + Ju1.0, Wh 
fre = f(ao + rh, yo + 8k), By writing 


Jon = BLE, Soo 
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and referring to the interpolation formulas of Newton, Stirling, Bessel, and Everett, 
deduce that a variety of two-dimensional interpolation formulas can be obtained by 
substituting one of the following indicated expansions for each operator, and trun- 
cating the result: 


E? 


1+ pat PO Dary... =itpy + PO tet... 


+m +Brg = [at — 0590 + PED por + |B 
=[a-m -P@=Be Day... | + [peas ... |p 


Which pairs of expansions would be appropriate for interpolation near corners of a 
table? Near the borders? At interior points? 


56. By using the Newton forward-difference expansion in both directions in Prob. 
F 


55, and retaining only differences through the first in each direction, deduce the 
upproximate formula 


furs 


2 


ql + rAz)(1 + sAy)fo,o 
( —r)(1 — s)foo + rl — s)fio + sl — r)for + rsfin 


and show that this formula would yield exact results if f(z,y) were of the form A + 
Br + Cy + Dzy. Also obtain the formula which neglects the mixed second differ- 
once AzAyfo,o, Show that it would yield exact results if f were of the form A + Bx + 
('y, and specialize both formulas when r = s = }. 

57. By using the Everett expansion in both directions in Prob. 55, and neglecting 
differences and mixed differences of order greater than three, deduce the approximate 
formula 


r(2 —1r) s(2 — 8) 
6 52 6 a| Soyo 





foe = (1 —r)(1 — 8) [1 _ 








1—?r Q2-— 
+ ras) [1 - = a — 80 2 65 | fue 
2- — 92 
+ -ns[1-E4 a i | fon 
— r2 — g2 
+re[1 at gl a 05 | fo 


Mhow also that it would yield exact results for 


fey) = A + Bix + Boy + Cit? + Cory + Coy? + Diz? + Dox?y + Dry? 
+ Dy + E,x5y + Foxy’, 
wnd specialize the formula when r = s = + 
68. A table includes the following ordinates and differences, together with a state- 
mont that differences of order four or greater are negligible. Use the formula of 
rob. 57 to interpolate for f(6.55, 1.05) and for f(6.524,1.042). 














y = 1.0 y= 11 
c S(a,y) 5 by S(a,y) 5 by 
6.5 0,9989623 —168 —31 0.9989783 —171 —28 
6.6 0,9990866 —147 —28 0.9991026 —-150 —26 








CHAPTER 6 


NUMERICAL SOLUTION OF DIFFERENTIAL EQUATIONS 


6.1. Introduction. Many techniques are available for the approx- 
imate solution of ordinary differential equations, or of sets of such equa- 
tions, by numerical methods. This chapter presents a selection of fre- 
quently used procedures of various types and illustrates their application. 
In addition, an indication is given of the troublesome problem of error 
propagation in stepwise integration processes, and over-all error bounds 
are obtained in illustrative cases. 

Some comments relative to the problem of selecting an appropriate 
technique are included in the concluding section (§6.19). 

Whereas most of the treatments deal with initial-value problems, brief 
considerations of boundary-value problems (§6.17) and characteristic- 
value problems (§6.18) are also included. 

6.2. Formulas of Open Type. We consider first the problem in which
it is desired to obtain a numerical approximate solution of the first-order
equation

$$\frac{dy}{dx} = F(x,y) \tag{6.2.1}$$

which takes on a prescribed value $y_0$ when $x = x_0$,

$$y(x_0) = y_0. \tag{6.2.2}$$

Starting with the known ordinate, it is proposed to calculate successively
the ordinates

$$y_1 = y(x_0 + h) = y(x_1), \quad y_2 = y(x_0 + 2h) = y(x_2), \quad \ldots, \quad y_n = y(x_0 + nh) = y(x_n), \quad \ldots, \tag{6.2.3}$$

where h is a suitably chosen spacing.
For this purpose, we may, in particular, make use of the relation

$$y_{n+1} = y_n + \int_{x_n}^{x_{n+1}} y'(x)\,dx. \tag{6.2.4}$$

Suppose that the ordinates $y_n, y_{n-1}, \ldots, y_1,$ and $y_0$ are known. Then
the corresponding values of y′(x) are calculable from the formula

$$y'_k \equiv y'(x_k) = F(x_k, y_k). \tag{6.2.5}$$

If we approximate y′(x) by the polynomial of degree N which takes on
the calculated values at the N + 1 points $x_n, x_{n-1}, \ldots,$ and $x_{n-N}$, by
making use of the Newton backward-difference formula (4.3.8),

$$y'_{n+s} \approx y'_n + s\,\nabla y'_n + \frac{s(s+1)}{2!}\,\nabla^2 y'_n + \cdots + \frac{s(s+1)\cdots(s+N-1)}{N!}\,\nabla^N y'_n, \tag{6.2.6}$$

where

$$s = \frac{x - x_n}{h}, \tag{6.2.7}$$

we may use this polynomial to extrapolate y′(x) over the interval $(x_n, x_n + h)$,
for the purpose of approximately effecting the integration indi-
cated in (6.2.4).


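The result of this calculation is

$$y_{n+1} = y_n + h\int_0^1 y'_{n+s}\,ds \approx y_n + h\sum_{k=0}^{N} a_k\,\nabla^k y'_n, \tag{6.2.8}$$

where

$$a_k = \int_0^1 \frac{s(s+1)\cdots(s+k-1)}{k!}\,ds, \tag{6.2.9}$$

the leading terms of (6.2.8) being of the form

$$y_{n+1} \approx y_n + h\left(1 + \tfrac{1}{2}\nabla + \tfrac{5}{12}\nabla^2 + \tfrac{3}{8}\nabla^3 + \tfrac{251}{720}\nabla^4 + \tfrac{95}{288}\nabla^5 + \cdots\right) y'_n, \tag{6.2.10}$$

in accordance with (5.4.13).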
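
For machine computation it is often convenient to rewrite a truncated form of (6.2.10) in terms of the slopes themselves rather than their differences. The following sketch, in which the list `A` and the function `ordinate_weights` are illustrative names not used in the text, expands each backward difference by the binomial relation $\nabla^k y'_n = \sum_j (-1)^j \binom{k}{j} y'_{n-j}$ and collects the weight of each slope.

```python
from fractions import Fraction
from math import comb

# Leading coefficients of (6.2.10), as given in the text.
A = [Fraction(1), Fraction(1, 2), Fraction(5, 12), Fraction(3, 8),
     Fraction(251, 720), Fraction(95, 288)]

def ordinate_weights(n_diff):
    """Weights b_j with sum_{k<=N} a_k nabla^k y'_n = sum_j b_j y'_{n-j}, N = n_diff."""
    weights = [Fraction(0)] * (n_diff + 1)
    for k in range(n_diff + 1):
        for j in range(k + 1):
            # nabla^k y'_n contributes (-1)^j C(k, j) to the weight of y'_{n-j}
            weights[j] += A[k] * (-1) ** j * comb(k, j)
    return weights

# Retaining third differences gives weights 55/24, -59/24, 37/24, -9/24
# for y'_n, y'_{n-1}, y'_{n-2}, y'_{n-3}.
print(ordinate_weights(3))
```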


The error term corresponding to truncation with the Nth difference of
$y'_n$ is given by h times the integral of the right-hand member of (4.3.9)
with f = y′, in the form

$$E = h^{N+2}\int_0^1 \frac{s(s+1)\cdots(s+N)}{(N+1)!}\,y^{(N+2)}(\eta)\,ds,$$

or, since the coefficient of $y^{(N+2)}$ does not change sign in (0,1),

$$E = a_{N+1}\,h^{N+2}\,y^{(N+2)}(\xi), \tag{6.2.11}$$

where $x_{n+1} > \xi > x_{n-N}$. Thus, for example, if only third differences are
retained, the error is given by $\tfrac{251}{720}h^5 y^{v}(\xi)$ where $x_{n+1} > \xi > x_{n-3}$.

More generally, we may use (6.2.6) in the relation

$$y_{n+1} = y_{n-p} + h\int_{-p}^{1} y'_{n+s}\,ds, \tag{6.2.12}$$

where p is any positive integer, to express the ordinate following the nth
one in terms of the ordinate calculated p steps previously and in terms of,
say, N + 1 already calculated values of y′. The formulas most fre-
quently used, in addition to (6.2.10) with p = 0, are those for which
p = 1, 3, and 5, the leading terms of which are of the form

$$y_{n+1} \approx y_{n-1} + h\left(2 + 0\nabla + \tfrac{1}{3}\nabla^2 + \tfrac{1}{3}\nabla^3 + \tfrac{29}{90}\nabla^4 + \tfrac{14}{45}\nabla^5 + \cdots\right) y'_n, \tag{6.2.13}$$

$$y_{n+1} \approx y_{n-3} + h\left(4 - 4\nabla + \tfrac{8}{3}\nabla^2 + 0\nabla^3 + \tfrac{14}{45}\nabla^4 + \tfrac{14}{45}\nabla^5 + \cdots\right) y'_n, \tag{6.2.14}$$

and

$$y_{n+1} \approx y_{n-5} + h\left(6 - 12\nabla + 15\nabla^2 - 9\nabla^3 + \tfrac{33}{10}\nabla^4 + 0\nabla^5 + \cdots\right) y'_n. \tag{6.2.15}$$

Whereas the error associated with terminating one of these formulas
with the Nth difference can be expressed in the form

$$E = h^{N+2}\int_{-p}^{1} \frac{s(s+1)\cdots(s+N)}{(N+1)!}\,y^{(N+2)}(\eta)\,ds, \tag{6.2.16}$$

where $\eta$ lies between $x_{n+1}$ and the smaller of $x_{n-p}$ and $x_{n-N}$, the fact that
the coefficient of $y^{(N+2)}$ changes sign in the integration range when p > 0
makes it impossible to apply the law of the mean directly in order to
obtain a simple form similar to (6.2.11). Somewhat more complicated
forms are obtainable by subdividing the range of integration and applying
the law of the mean to each subinterval, or, better, by using one of the
methods of §§5.11 and 5.12.

The formulas for which p is an odd integer are of particular interest 
because of the fact that, in each such formula, the coefficient of the pth 
difference is found to be zero. In these cases, the retention of p — 1 
differences thus affords the same accuracy as the retention of p differences. 
Indeed, the cases in which N = p correspond to the use of Newton-Cotes 
formulas of the open type, employing an odd number of ordinates, in the 
integration indicated in (6.2.12). Further, the error terms in those cases 
can be expressed in a form similar to (6.2.11) and are given for p = 3 and
p = 5 in Eqs. (3.5.19) and (3.5.21). Thus, in particular, we have the
special formulas

$$y_{n+1} = y_{n-1} + 2h\,y'_n + \frac{h^3}{3}\,y'''(\xi), \tag{6.2.17}$$

$$y_{n+1} = y_{n-3} + 4h\left(y'_n - \nabla y'_n + \tfrac{2}{3}\nabla^2 y'_n\right) + \frac{14h^5}{45}\,y^{v}(\xi), \tag{6.2.18}$$

and

$$y_{n+1} = y_{n-5} + 6h\left(y'_n - 2\nabla y'_n + \tfrac{5}{2}\nabla^2 y'_n - \tfrac{3}{2}\nabla^3 y'_n + \tfrac{11}{20}\nabla^4 y'_n\right) + \frac{41h^7}{140}\,y^{vii}(\xi), \tag{6.2.19}$$

where, in each case, $\xi$ lies between the largest and smallest of the argu-
ments involved in that formula. These formulas, and corresponding ones
for p = 7, 9, . . . , have the property that, in each case, the retention
of differences through the Nth leads to a formula with accuracy "of
order N + 2," that is, to an error term proportional to $h^{N+3}$, whereas
for the other formulas of the type considered here the accuracy corre-
sponding to the retention of differences through the Nth is of order
N + 1.†
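
The error constants in (6.2.17) to (6.2.19) can be confirmed by applying each formula, with h = 1 and $x_n = 0$, to the monomial whose degree equals the order of the derivative in its error term. A minimal sketch follows; `nabla` and `error_constant` are illustrative helpers, and the difference coefficients passed in are simply those read off from the three formulas.

```python
from fractions import Fraction
from math import factorial

def nabla(values, order):
    """Backward difference of the given order at the last entry of values."""
    vals = list(values)
    for _ in range(order):
        vals = [b - a for a, b in zip(vals, vals[1:])]
    return vals[-1]

def error_constant(p, diff_coeffs, degree):
    """Error of y_{n+1} = y_{n-p} + h * sum_k c_k nabla^k y'_n for y = x**degree,
    taken with h = 1 and x_n = 0, and divided by degree! (the constant derivative)."""
    y = lambda x: Fraction(x) ** degree
    dy = lambda x: degree * Fraction(x) ** (degree - 1)
    slopes = [dy(x) for x in range(-len(diff_coeffs) + 1, 1)]   # y'_{n-N}, ..., y'_n
    predicted = y(-p) + sum(c * nabla(slopes, k) for k, c in enumerate(diff_coeffs))
    return (y(1) - predicted) / factorial(degree)

c17 = error_constant(1, [Fraction(2)], 3)
c18 = error_constant(3, [Fraction(4), Fraction(-4), Fraction(8, 3)], 5)
c19 = error_constant(5, [Fraction(6), Fraction(-12), Fraction(15),
                         Fraction(-9), Fraction(33, 10)], 7)
print(c17, c18, c19)   # compare with the coefficients 1/3, 14/45, 41/140
```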

It is clear that, since a formula employing Nth differences depends
upon knowledge of N + 1 successive values of $y'_k$, and since initially
only $y_0$ is known, such a formula cannot be used until N additional ordi-
nates have been determined by another method.

Before illustrating the use of such formulas, it is desirable to consider 
a class of related formulas. 

6.3. Formulas of Closed Type. The formulas derived in the preceding
section express $y_{n+1}$ in terms only of previously calculated ordinates and
slopes. A set of similar formulas which involve also the unknown slope
$y'_{n+1}$ is obtained by replacing the right-hand member of (6.2.6) by the
interpolation polynomial agreeing with $y'(x)$ at $x_{n+1}, x_n, \ldots, x_{n-N+1}$:

$$ y'_{n+s} \approx y'_{n+1} + (s-1)\nabla y'_{n+1} + \frac{(s-1)s}{2!}\nabla^2 y'_{n+1} + \cdots + \frac{(s-1)s(s+1)\cdots(s+N-2)}{N!}\nabla^N y'_{n+1}, \tag{6.3.1} $$


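where $s$ is again defined by (6.2.7). If this approximation is introduced
into (6.2.12), the results in the cases $p = 0$, 1, 3, and 5 are obtained in the
forms

$$ y_{n+1} \approx y_n + h\left(1 - \tfrac{1}{2}\nabla - \tfrac{1}{12}\nabla^2 - \tfrac{1}{24}\nabla^3 - \tfrac{19}{720}\nabla^4 - \tfrac{3}{160}\nabla^5 - \cdots\right) y'_{n+1}, \tag{6.3.2} $$

$$ y_{n+1} \approx y_{n-1} + h\left(2 - 2\nabla + \tfrac{1}{3}\nabla^2 + 0\nabla^3 - \tfrac{1}{90}\nabla^4 - \tfrac{1}{90}\nabla^5 - \cdots\right) y'_{n+1}, \tag{6.3.3} $$

$$ y_{n+1} \approx y_{n-3} + h\left(4 - 8\nabla + \tfrac{20}{3}\nabla^2 - \tfrac{8}{3}\nabla^3 + \tfrac{14}{45}\nabla^4 - 0\nabla^5 - \cdots\right) y'_{n+1}, \tag{6.3.4} $$

and

$$ y_{n+1} \approx y_{n-5} + h\left(6 - 18\nabla + 27\nabla^2 - 24\nabla^3 + \tfrac{123}{10}\nabla^4 - \tfrac{33}{10}\nabla^5 + \cdots\right) y'_{n+1}. \tag{6.3.5} $$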
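The numerical coefficients in (6.3.2) to (6.3.5) can be regenerated by integrating each term of (6.3.1) over $-p \le s \le 1$. The sketch below assumes exactly that construction; the helper names (`poly_mul_linear`, `poly_integral`, `closed_coefficients`) are illustrative and not taken from the text, and polynomials are stored as lists of ascending coefficients.

```python
from fractions import Fraction

def poly_mul_linear(coeffs, c):
    """Multiply a polynomial (ascending coefficients) by (s + c)."""
    out = [Fraction(0)] * (len(coeffs) + 1)
    for i, a in enumerate(coeffs):
        out[i] += a * c
        out[i + 1] += a
    return out

def poly_integral(coeffs, lo, hi):
    """Definite integral of the polynomial from lo to hi."""
    F = lambda s: sum(a * s ** (i + 1) / (i + 1) for i, a in enumerate(coeffs))
    return F(Fraction(hi)) - F(Fraction(lo))

def closed_coefficients(p, n_terms):
    """Coefficients of 1, ∇, ∇², ... multiplying h*y'_{n+1} in the closed formula."""
    result = []
    poly = [Fraction(1)]      # numerator (s-1)s(s+1)...(s+k-2); empty product for k = 0
    fact = 1                  # k!
    for k in range(n_terms):
        result.append(poly_integral(poly, -p, 1) / fact)
        poly = poly_mul_linear(poly, Fraction(k - 1))   # append the factor (s + k - 1)
        fact *= k + 1
    return result

for p in (0, 1, 3, 5):
    print(p, [str(c) for c in closed_coefficients(p, 6)])
```

For $p = 1$ and $p = 3$ the computed coefficients of $\nabla^3$ and $\nabla^5$, respectively, come out as exact zeros, which is the property exploited in the special formulas of this section.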


The error associated with retaining only $N$th differences in a formula
relating $y_{n+1}$ and $y_{n-p}$ can be expressed in the form

$$ E_N = \frac{h^{N+2}}{(N+1)!}\int_{-p}^{1} (s-1)s(s+1)\cdots(s+N-1)\,y^{(N+2)}(\eta)\,ds, \tag{6.3.6} $$

where $\eta$ lies between $x_{n+1}$ and the smaller of $x_{n-p}$ and $x_{n-N+1}$. When
$p = 0$, the law of the mean can be used, as in the preceding section, to
show that the error is expressible in the form (6.2.11), where $a_{N+1}$ is the
numerical coefficient in the first neglected term. In the cases (6.3.3)
and (6.3.5), for which $p$ is an odd integer, it is seen that retention of $p + 1$
differences is equivalent to the retention of $p + 2$ differences and that
the use of these special formulas corresponds to the use of Newton-Cotes
formulas of the closed type, employing an odd number of ordinates, for
which the error terms are obtainable from (3.5.11) and (3.5.13) when
$p = 1$ and $p = 3$. Thus we have the special formulas

$$ y_{n+1} = y_{n-1} + 2h\left(y'_{n+1} - \nabla y'_{n+1} + \tfrac{1}{6}\nabla^2 y'_{n+1}\right) - \frac{h^5}{90}\,y^{v}(\xi), \tag{6.3.7} $$

and

$$ y_{n+1} = y_{n-3} + 4h\left(y'_{n+1} - 2\nabla y'_{n+1} + \tfrac{5}{3}\nabla^2 y'_{n+1} - \tfrac{2}{3}\nabla^3 y'_{n+1} + \tfrac{7}{90}\nabla^4 y'_{n+1}\right) - \frac{8h^7}{945}\,y^{vii}(\xi), \tag{6.3.8} $$

for which the retention of $N$th differences yields an accuracy of order
$N + 2$, whereas the other formulas of the type considered generally yield
$(N + 1)$th-order accuracy.

† It is seen that the terminology here is also such that a formula with “accuracy of
order $m$” would yield exact results if the required solution $y(x)$ were a polynomial of
degree $m$ or less. When $y(x)$ is not such a function, it is not necessarily true that an
increase in $m$ corresponds to an improvement in the approximation afforded, as was
seen in §3.7.

Formulas of the sort derived in this section may be said to be of closed
type, since the expressions for the required ordinate $y_{n+1}$, at the point
$x_{n+1}$, involve the unknown slope $y'_{n+1}$ at that point, whereas those of the
preceding section involve only known slopes at preceding points and are
accordingly said to be of open type. A comparison of corresponding
formulas employing a like number of differences shows that the error
terms associated with formulas of closed type possess smaller numerical
coefficients. However, since the unknown $y_{n+1}$ is involved (explicitly
and implicitly) in both members of formulas of closed type, this advantage
must be weighed against the fact that, unless $y' = F(x,y)$ is a linear
function of $y$, the equation relevant to such a formula generally must be
solved for $y_{n+1}$ by iterative methods.

6.4. Start of Solution. Except for the special formula

$$ y_{n+1} = y_n + h\,y'_n + \frac{h^2}{2}\,y''(\xi), \tag{6.4.1} $$

obtained by omitting all differences in (6.2.10), and a similar formula of
closed type, obtained from (6.3.2), each of the formulas obtained in the
preceding sections can be applied only after the calculation of a number
of ordinates $y_1, y_2, \ldots, y_N$ equal to the order of the highest difference
retained in that formula, in addition to the prescribed ordinate $y_0$.

One method of starting the solution of the problem

$$ \frac{dy}{dx} = F(x,y), \qquad y(x_0) = y_0 \tag{6.4.2} $$

consists in determining the coefficients of a Taylor expansion

$$ y_s \equiv y(x_0 + hs) = y_0 + \frac{h\,y'_0}{1!}\,s + \frac{h^2 y''_0}{2!}\,s^2 + \cdots + \frac{h^m y^{(m)}_0}{m!}\,s^m + \frac{h^{m+1} y^{(m+1)}(\xi)}{(m+1)!}\,s^{m+1}, \tag{6.4.3} $$

where $y^{(k)}_0 = (d^k y/dx^k)_{x = x_0}$ and $x_0 < \xi < x_0 + hs$, by successively
differentiating the basic differential equation or otherwise, under the
assumption that a convergent expansion of this type exists, when $s$ is
sufficiently small.


Thus, recalling that $d/dx = \partial/\partial x + y'\,\partial/\partial y$, we obtain the relations

$$ y' = F(x,y), \qquad y'' = F_x(x,y) + y' F_y(x,y), $$

$$ y''' = F_{xx}(x,y) + 2y' F_{xy}(x,y) + y'^2 F_{yy}(x,y) + y'' F_y(x,y), $$

and so forth, and hence there follows

$$ y'_0 = F(x_0,y_0), \qquad y''_0 = F_x(x_0,y_0) + y'_0 F_y(x_0,y_0), \tag{6.4.4} $$

and so forth.
Whereas these general expressions become quite involved as the order of
the required derivative increases, they are not actually needed in
practice. In order to illustrate this fact, we consider the specific example

$$ \frac{dy}{dx} = x^2 - y, \qquad y(0) = 1, \tag{6.4.5} $$

for which the exact solution is readily found to be

$$ y = 2 - 2x + x^2 - e^{-x}. \tag{6.4.6} $$

From the given equation, we obtain successively

$$ y' = x^2 - y, \quad y'' = 2x - y', \quad y''' = 2 - y'', \quad y^{iv} = -y''', \quad y^{v} = -y^{iv}, \quad \ldots \tag{6.4.7} $$

and hence, with $x_0 = 0$, there follows

$$ y_0 = 1, \quad y'_0 = -1, \quad y''_0 = 1, \quad y'''_0 = 1, \quad y^{iv}_0 = -1, \quad y^{v}_0 = 1, \quad y^{vi}_0 = -1, \quad \ldots $$

Thus, if we take $h = \tfrac{1}{10}$, Eq. (6.4.3) gives

$$ y_s = 1 - \frac{s}{10} + \frac{1}{2}\left(\frac{s}{10}\right)^2 + \frac{1}{6}\left(\frac{s}{10}\right)^3 - \frac{1}{24}\left(\frac{s}{10}\right)^4 + \frac{1}{120}\left(\frac{s}{10}\right)^5 - \frac{1}{720}\left(\frac{s}{10}\right)^6 + \cdots \tag{6.4.8} $$

and, with $s = 1$, 2, and 3, we obtain

$$ y_1 = 0.90516, \qquad y_2 = 0.82127, \qquad y_3 = 0.74918, \tag{6.4.9} $$

to five places. Since the successive terms of (6.4.8) alternate in sign from
the fourth term onward, and decrease steadily in magnitude, the error
due to truncation is smaller in magnitude than the first neglected term
and is of the same sign. Additional ordinates could be obtained to this
accuracy by retaining sufficiently many terms in the expansion.
Alternatively, a new expansion could be launched from the point $x_3$, in the
form

$$ y_{3+s} = y_3 + \frac{h\,y'_3}{1!}\,s + \frac{h^2 y''_3}{2!}\,s^2 + \cdots, $$

with $y_3$ known and $y'_3, y''_3, \ldots$ calculable in terms of $y_3$ from (6.4.7).
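The Taylor-series start for the example (6.4.5) is easily mechanized. The following sketch assumes the derivative relations (6.4.7) and retains derivatives through the eighth; the function names `taylor_derivatives_at` and `taylor_value` are hypothetical. Because the derivatives are generated from an arbitrary point, the same routine serves for relaunching the expansion from $x_3$.

```python
from math import factorial

def taylor_derivatives_at(x0, y0, n_derivs):
    """Return [y, y', y'', ...] at x0 for y' = x**2 - y, using the relations (6.4.7)."""
    d = [y0, x0 ** 2 - y0]              # y and y'
    d.append(2 * x0 - d[1])             # y''  = 2x - y'
    d.append(2 - d[2])                  # y''' = 2 - y''
    while len(d) <= n_derivs:
        d.append(-d[-1])                # each higher derivative is minus the preceding one
    return d[: n_derivs + 1]

def taylor_value(derivs, h, s):
    """Evaluate the truncated expansion (6.4.3) at x0 + h*s."""
    return sum(dk * (h * s) ** k / factorial(k) for k, dk in enumerate(derivs))

derivs = taylor_derivatives_at(0.0, 1.0, 8)
start = [round(taylor_value(derivs, 0.1, s), 5) for s in (1, 2, 3)]
print(start)   # [0.90516, 0.82127, 0.74918], the values (6.4.9)
```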


It is obvious that the linear example chosen illustrates a particularly 
simple case, because of the simplicity of the relations (6.4.7) and because 
of the fact that (6.4.8) is an alternating series and hence is amenable to a 
precise truncation-error analysis. More usually, the relations (6.4.7) 
are replaced by successive equations which increase fairly rapidly in 
complexity, so that it is usually desirable to abandon this procedure in 
favor of a more convenient one when sufficiently many starting values 
have been obtained. Furthermore, since y(x) itself is not known, it is 
usually difficult to obtain a reliable estimate of the truncation-error 
term given in (6.4.3), even though it be known that the series itself is 
convergent for the value of s under consideration. 

Discussion of the convergence (and existence) of (6.4.3) in the general
case of the problem (6.4.2), as well as consideration of other types of
expansions which can be used when (6.4.3) cannot, must be omitted here
(see Ince [120]). In some cases it is preferable to determine the
coefficients in an assumed expansion of the form

$$ y(x) = \sum_{k=0}^{\infty} A_k (x - x_0)^k $$

by inserting that expansion in the differential equation and obtaining a
recurrence formula to be satisfied by the $A$'s.
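As an illustration of the recurrence approach, consider again the example (6.4.5) with $x_0 = 0$. Inserting the series into $y' = x^2 - y$ and equating coefficients of $x^k$ gives $(k+1)A_{k+1} = \delta_{k2} - A_k$, where $\delta_{k2}$ is 1 for $k = 2$ and 0 otherwise. This recurrence is worked out here for this one equation only; the function name `series_coefficients` is illustrative.

```python
from fractions import Fraction

def series_coefficients(n_terms, y0=Fraction(1)):
    """Coefficients A_0, A_1, ... of the power series about x0 = 0 for y' = x**2 - y."""
    A = [y0]
    for k in range(n_terms - 1):
        forcing = Fraction(1) if k == 2 else Fraction(0)   # coefficient of x**k in x**2
        A.append((forcing - A[k]) / (k + 1))
    return A

print([str(a) for a in series_coefficients(7)])
```

With $h = 1/10$ these coefficients are the ones multiplying the powers of $s/10$ in (6.4.8).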

A similar method, which has the advantage that the order of the
highest derivative required is about half that needed in (6.4.3), but has the
disadvantage that each forward step involves an iterative process, is
treated in §6.14.

Mention should also be made of Picard's method, in which the problem
(6.4.2) is first transformed into the integral equation

$$ y(x) = y_0 + \int_{x_0}^{x} F(x, y(x))\,dx, $$

and successive functions approximating $y(x)$ near $x = x_0$ are generated
by the iteration

$$ y^{[k+1]}(x) = y_0 + \int_{x_0}^{x} F(x, y^{[k]}(x))\,dx. \tag{6.4.10} $$

The initial approximation $y^{[0]}(x)$ is conveniently taken to be the constant
$y_0$ or the linear function $y_0 + y'_0(x - x_0)$, where $y'_0$ is determined from the
differential equation.


Thus, in the preceding example, we would write

$$ y^{[k+1]}(x) = 1 + \int_0^x \left[x^2 - y^{[k]}(x)\right] dx \tag{6.4.11} $$

and, with $y^{[0]}(x) = 1$, there would then follow

$$ y^{[1]}(x) = 1 - x + \tfrac{1}{3}x^3, \qquad y^{[2]}(x) = 1 - x + \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 - \tfrac{1}{12}x^4, \tag{6.4.12} $$

and so forth. The accuracy afforded by a member of the sequence of
approximations at a certain number of points $x_1, x_2, \ldots$ could be
estimated by comparing calculated values at those points with values
calculated from the preceding approximation, or by use of appropriate
analytical methods.
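For this polynomial example the Picard iteration (6.4.11) can be carried out exactly on lists of coefficients. The sketch below is specific to $y' = x^2 - y$, $y(0) = 1$; the function name `picard_step` and the ascending-coefficient representation are assumptions made for illustration.

```python
from fractions import Fraction

def picard_step(y_coeffs):
    """One Picard step (6.4.11); polynomials are lists of ascending coefficients."""
    n = max(len(y_coeffs), 3)
    integrand = [Fraction(0)] * n
    integrand[2] += 1                                  # the x**2 term
    for i, a in enumerate(y_coeffs):
        integrand[i] -= a                              # minus the previous iterate
    # Integrate term by term from 0 to x, then add the initial value y0 = 1.
    integral = [Fraction(0)] + [c / (i + 1) for i, c in enumerate(integrand)]
    integral[0] = Fraction(1)
    return integral

y0 = [Fraction(1)]
y1 = picard_step(y0)      # 1 - x + x**3/3
y2 = picard_step(y1)      # 1 - x + x**2/2 + x**3/3 - x**4/12
print([str(c) for c in y1], [str(c) for c in y2])
```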

While Picard's method is of great theoretical importance, the explicit
evaluation of the integral in (6.4.10) is often impracticable in cases
which are less simple than the preceding one. Thus, for the problem
$y' = \cos(x + y)$, $y(0) = 1$, the first iteration with $y^{[0]}(x) = 1$ gives
$y^{[1]}(x) = 1 - \sin 1 + \sin(x + 1)$, and the second iteration would
involve the evaluation of the form

$$ y^{[2]}(x) = 1 + \int_0^x \cos\left[1 - \sin 1 + x + \sin(x + 1)\right] dx. $$

Also, when $F(x,y)$ is not given analytically, neither this procedure nor
the power-series method is directly applicable.

A frequently used class of procedures consists in evaluating the integral
of $y' = F(x,y)$ in (6.4.10) approximately, by use of numerical methods.
Thus, in particular, if $y'$ is approximated by the Newton forward-
difference polynomial-interpolation formula, the results of the
integration are obtained by replacing p(x) by $y'(x)$ in (5.4.8), and we have the
formulas

$$ y_1 = y_0 + h\left[1 + \tfrac{1}{2}\Delta - \tfrac{1}{12}\Delta^2 + \tfrac{1}{24}\Delta^3 - \tfrac{19}{720}\Delta^4 + \cdots\right] y'_0, $$

$$ y_2 = y_0 + h\left[2 + 2\Delta + \tfrac{1}{3}\Delta^2 + 0\Delta^3 - \tfrac{1}{90}\Delta^4 + \cdots\right] y'_0, $$

$$ y_3 = y_0 + h\left[3 + \tfrac{9}{2}\Delta + \tfrac{9}{4}\Delta^2 + \tfrac{3}{8}\Delta^3 - \tfrac{3}{80}\Delta^4 + \cdots\right] y'_0, \tag{6.4.13} $$

$$ y_4 = y_0 + h\left[4 + 8\Delta + \tfrac{20}{3}\Delta^2 + \tfrac{8}{3}\Delta^3 + \tfrac{14}{45}\Delta^4 + \cdots\right] y'_0, $$

and so forth. Here $y_0$ is given, and if, say, $y_1$, $y_2$, $y_3$, and $y_4$ are estimated,
the corresponding values of $y'_1, \ldots, y'_4$ can be calculated from the
differential equation and introduced into the right-hand members to give
new approximations to $y_1, \ldots, y_4$, after which the process may be
iterated.

In the case of (6.4.5) we may notice that, since the value $y'_0 = -1$ is
obtained from the differential equation, the initial approximation
$y^{[0]}(x) = 1 - x$ is appropriate. With $h = 0.1$, the following array may then
be formed when three additional ordinates are incorporated:

   x       y          y'         Δ       Δ²     Δ³
  0.0   1.00000   −1.00000
                               11000
  0.1   0.90000   −0.89000              2000
                               13000              0
  0.2   0.80000   −0.76000              2000
                               15000
  0.3   0.70000   −0.61000





The use of the first three of Eqs. (6.4.13), retaining third differences,
leads to the new array

   x       y          y'         Δ       Δ²     Δ³
  0.0   1.00000   −1.00000
                               10467
  0.1   0.90533   −0.89533               799
                               11266            −198
  0.2   0.82267   −0.78267               601
                               11867
  0.3   0.75400   −0.66400

Three additional iterations yield results which are unchanged by further
iteration and which are correct to those places. The correctness of those
values would be checked, in practice, by considering the effects of the
neglected fourth differences, sample values of which become available as
the calculation is advanced from this stage.
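The iterative start just described can be sketched as follows, assuming the first three of Eqs. (6.4.13) truncated after the third difference. The function name `start_by_iteration` and its argument list are hypothetical, and no rounding to five places is performed between sweeps, so later digits may differ slightly from a hand calculation.

```python
def start_by_iteration(f, x0, y0, h, y_guess, sweeps):
    """Iterate the first three of Eqs. (6.4.13), retaining third differences.

    Returns the list of successive approximations [y_1, y_2, y_3], one per sweep.
    """
    # Rows: coefficients of 1, Δ, Δ², Δ³ (acting on y'_0) for y_1, y_2, y_3.
    coeffs = [(1, 1 / 2, -1 / 12, 1 / 24),
              (2, 2, 1 / 3, 0),
              (3, 9 / 2, 9 / 4, 3 / 8)]
    y = [y0] + list(y_guess)
    history = []
    for _ in range(sweeps):
        dy = [f(x0 + k * h, y[k]) for k in range(4)]       # slopes from the equation
        d1 = [dy[k + 1] - dy[k] for k in range(3)]         # first differences
        d2 = [d1[k + 1] - d1[k] for k in range(2)]         # second differences
        d3 = d2[1] - d2[0]                                 # third difference
        diffs = (dy[0], d1[0], d2[0], d3)
        y = [y0] + [y0 + h * sum(c * d for c, d in zip(row, diffs)) for row in coeffs]
        history.append(y[1:])
    return history

f = lambda x, y: x * x - y
for sweep in start_by_iteration(f, 0.0, 1.0, 0.1, [0.9, 0.8, 0.7], sweeps=4):
    print([round(v, 5) for v in sweep])
```

The first sweep, started from the linear estimate $1 - x$, reproduces the ordinates 0.90533, 0.82267, and 0.75400 of the second array above.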
A useful variation of this procedure consists in using central differences
and in determining ordinates on both sides of the initial point $x_0$. Thus,
by appropriate integration of the Stirling interpolation formula, we obtain
the relations

$$ y_{-2} = y_0 + h\left[-2 + 2\mu\delta - \tfrac{4}{3}\delta^2 + \tfrac{1}{3}\mu\delta^3 - \tfrac{7}{45}\delta^4 + \cdots\right] y'_0, $$

$$ y_{-1} = y_0 + h\left[-1 + \tfrac{1}{2}\mu\delta - \tfrac{1}{6}\delta^2 - \tfrac{1}{24}\mu\delta^3 + \tfrac{1}{180}\delta^4 + \cdots\right] y'_0, \tag{6.4.14} $$

$$ y_1 = y_0 + h\left[1 + \tfrac{1}{2}\mu\delta + \tfrac{1}{6}\delta^2 - \tfrac{1}{24}\mu\delta^3 - \tfrac{1}{180}\delta^4 + \cdots\right] y'_0, $$

$$ y_2 = y_0 + h\left[2 + 2\mu\delta + \tfrac{4}{3}\delta^2 + \tfrac{1}{3}\mu\delta^3 + \tfrac{7}{45}\delta^4 + \cdots\right] y'_0. $$

Since here the calculated ordinates are taken as close to the starting
point as possible, the convergence of the iterative process is generally
more rapid than that associated with the use of (6.4.13) unless the solution
displays unfavorable characteristics to the left of the point $x_0$.† Since
truncation with a mean odd difference would not correspond to true
collocation, it is desirable to use a symmetrical array of abscissas.

In the case of the preceding example, we may start with the array

    x       y          y'         δ        δ²      δ³     δ⁴
  −0.2   1.20000   −1.16000
                                 7000
  −0.1   1.10000   −1.09000               2000
                                 9000               0
   0.0   1.00000   −1.00000   (10000)     2000     (0)     0
                                11000               0
   0.1   0.90000   −0.89000               2000
                                13000
   0.2   0.80000   −0.76000

when four additional ordinates are used. The mean odd central
differences are entered in parentheses. After four iterations, using (6.4.14),
we obtain the array

    x       y          y'         δ        δ²      δ³     δ⁴
  −0.2   1.21860   −1.17860
                                 8377
  −0.1   1.10483   −1.09483               1106
                                 9483             −105
   0.0   1.00000   −1.00000               1001             9
                                10484              −96
   0.1   0.90516   −0.89516                905
                                11389
   0.2   0.82127   −0.78127

which is unchanged, to the five places retained, by further iteration.
Thus five-place values of $y_{-2}$, $y_{-1}$, $y_0$, $y_1$, and $y_2$ are now available for the
advancing calculation. Here the fact that the effect of the fourth
differences is negligible supplies fair evidence that sufficiently many ordinates
were used.

It is important to notice that the formulas of (6.4.13) or (6.4.14) can
also be expressed explicitly in terms of the slopes $y'_k$, if the use of
differences is undesirable, once the number of slopes to be retained has been
decided (the corresponding five-slope formulas are given in Milne [155]).

Another class of self-starting methods, which are also useful when
$F(x,y)$ is not defined analytically, but which are noniterative, is treated in
§§6.15 and 6.16.

† The calculation of ordinates on both sides of the starting point is also frequently
convenient when use is made of power series.

6.5. Methods Based on Open-type Formulas. Once at least $N$
additional ordinates, say $y_1, y_2, \ldots, y_N$, are determined, the calculation
may be continued by use of one of the formulas, derived in §6.2 or 6.3,
which involves $N$th differences. In the case of the example (6.4.5), with
the calculated data (6.4.9), the preliminary tabulation may be arranged
as in Table 6.1 where, for compactness, the backward difference $\nabla^k y'_n$ is
written in the same line as the entry $y'_n$.

TABLE 6.1

   x       y          y'        ∇y'     ∇²y'    ∇³y'    ∇⁴y'
  0.0   1.00000   −1.00000
  0.1   0.90516   −0.89516    10484
  0.2   0.82127   −0.78127    11389     905
  0.3   0.74918   −0.65918    12209     820     −85

In particular, the Adams method uses formula (6.2.10), truncated to a
suitable number of terms, for advancing the calculation. (The simplest
such procedure, in which no differences are retained, is often known as
Euler's method.) Thus, if third differences are retained, the Adams
method next yields

$$ y_4 \approx 0.74918 + \tfrac{1}{10}\left[-0.65918 + \tfrac{1}{2}(0.12209) + \tfrac{5}{12}(0.00820) - \tfrac{3}{8}(0.00085)\right] = 0.68968, $$

after which an additional line

  0.4 | 0.68968 | −0.52968 | 12950     741     −79     6          (6.5.1)

is entered for the purpose of advancing to $y_5$. If again only third
differences are retained, the next line appears as follows:

  0.5 | 0.64347 | −0.39347 | 13621     671     −70     9          (6.5.2)

The fourth difference is carried along as a partial-check column. Since
the truncation error in each step is of the form

$$ \tfrac{251}{720}\,h^5 y^{v}(\xi), $$

for some $\xi$, and since $\nabla^4 y'_n$ is of the form $h^4 y^{v}(\eta)$, for some $\eta$, the two
available sample values of $\nabla^4 y'$ indicate that $h^4 y^{v}$ probably does not vary
strongly over the relevant range, so that a fairly dependable estimate of
the truncation error committed in each of the steps can be obtained by
calculating the contribution $\tfrac{251}{720}\,h\,\nabla^4 y'_n$ of the first neglected difference.
With $h = 0.1$, this contribution will amount to less than one-half unit in
the fifth place if $\nabla^4 y'_n$ does not exceed 14 units in that place.

If use is made instead of formula (6.2.18), in which only second
differences are retained, the same results are obtained. Here the error
estimate again depends upon the fourth difference, the factor $\tfrac{14}{45} \approx 0.31$
replacing the factor $\tfrac{251}{720} \approx 0.35$ relevant to the Adams formula with
third differences. Thus, as compared with the Adams method, this
method here possesses the advantage that one less difference is needed in
the calculation (but not in the error check) and that the coefficients in
the formula are somewhat simpler.
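The two Adams steps above can be reproduced with a few lines of code. This is a minimal sketch, assuming formula (6.2.10) truncated after the third difference and five-place rounding of each new ordinate, as in the hand calculation; the names `adams_step` and `next_differences` are illustrative.

```python
def adams_step(y_n, diffs, h):
    """Advance one step by (6.2.10), using y'_n, ∇y'_n, ∇²y'_n, ∇³y'_n from diffs."""
    coeffs = (1, 1 / 2, 5 / 12, 3 / 8)
    return y_n + h * sum(c * d for c, d in zip(coeffs, diffs))

def next_differences(prev_row, dy_new):
    """Form the new line y', ∇y', ∇²y', ... from the preceding line and the new slope."""
    row = [dy_new]
    for old in prev_row:
        row.append(row[-1] - old)       # each new difference is (new entry) - (old entry)
    return row

h = 0.1
row3 = [-0.65918, 0.12209, 0.00820, -0.00085]     # line x = 0.3 of Table 6.1
y4 = round(adams_step(0.74918, row3, h), 5)
row4 = next_differences(row3, 0.4 ** 2 - y4)      # slope from y' = x**2 - y, plus differences
y5 = round(adams_step(y4, row4, h), 5)            # zip stops after the third difference
print(y4, [round(d, 5) for d in row4], y5)
```

The list `row4` carries one more difference than `row3`, which corresponds to the fourth-difference check column of line (6.5.1).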

It should be emphasized that the errors so far considered are those
which would arise in a single step from $x_n$ to $x_{n+1}$ if $y_0, y_1, \ldots, y_n$ were
exactly correct and if no round-off errors were introduced in that step.
In addition, however, one must consider the cumulative effect of the
errors introduced in preceding steps. Whereas consideration of the
propagation of errors is postponed to §§6.7 and 6.8, it may be remarked here
that the advantage in stability generally lies with the Adams method.
This situation is related to the fact that the ordinates themselves are
“loosely coupled” by (6.2.18), in that the ordinate $y_n$ is linked directly
only with ordinates of the form $y_{n-4i}$, where $i$ is an integer, whereas in
(6.2.10) all ordinates are directly linked together.

6.6. Methods Based on Closed-type Formulas. The usual method of
employing one of the formulas of §6.3 to calculate $y_{n+1}$ consists in first
estimating $y_{n+1}$, calculating $y'_{n+1} = F(x_{n+1}, y_{n+1})$ and forming the requisite
differences $\nabla^k y'_{n+1}$ corresponding to this estimate, and then calculating an
improved estimate of $y_{n+1}$ by use of the formula. The cycle is repeated,
if necessary, until two successive estimates agree within the prescribed
tolerance. The initial estimate, say $y^{(0)}_{n+1}$, may be obtained by use of a
formula of open type.

Thus, returning to the example considered in the preceding section,
line (6.5.1) can be considered as the result of using the Adams method,
with third differences, as a “predictor.” If now the data in this line are
used in (6.3.2), truncated also with third differences, the first revised
ordinate $y^{(1)}_4$ is given by

$$ y^{(1)}_4 = 0.74918 + \tfrac{1}{10}\left[-0.52968 - \tfrac{1}{2}(0.12950) - \tfrac{1}{12}(0.00741) + \tfrac{1}{24}(0.00079)\right] = 0.68968, $$

which agrees with the initial prediction to the five places retained, so
that iteration is not needed.

Suppose, however, that in the more general case the result obtained
after sufficiently many iterations of (6.3.2) is denoted by $y_{n+1}$ and that
the initial prediction, afforded by (6.2.10), is again denoted by $y^{(0)}_{n+1}$.
If only third differences are retained, the calculated values of $y^{(0)}_{n+1}$ and
$y_{n+1}$ then satisfy the equations

$$ y^{(0)}_{n+1} = y_n + \frac{h}{24}\left(55y'_n - 59y'_{n-1} + 37y'_{n-2} - 9y'_{n-3}\right), $$

$$ y_{n+1} = y_n + \frac{h}{24}\left(9y'_{n+1} + 19y'_n - 5y'_{n-1} + y'_{n-2}\right), \tag{6.6.1} $$

where $y'_k = F(x_k, y_k)$, if the errors due to round-off are neglected. If the
calculated values of $y_1, y_2, \ldots, y_n$ and, accordingly, of $y'_1, y'_2, \ldots, y'_n$
were exactly correct, then the true ordinate at $x_{n+1}$, say $Y_{n+1}$, would
satisfy the equations

$$ Y_{n+1} = y_n + \frac{h}{24}\left(55y'_n - 59y'_{n-1} + 37y'_{n-2} - 9y'_{n-3}\right) + \tfrac{251}{720}\,h^5 y^{v}(\xi_1), $$

$$ Y_{n+1} = y_n + \frac{h}{24}\left(9Y'_{n+1} + 19y'_n - 5y'_{n-1} + y'_{n-2}\right) - \tfrac{19}{720}\,h^5 y^{v}(\xi_2), \tag{6.6.2} $$

where $\xi_1$ and $\xi_2$ both lie between $x_{n-3}$ and $x_{n+1}$, and it would follow that

$$ Y_{n+1} - y^{(0)}_{n+1} = \tfrac{251}{720}\,h^5 y^{v}(\xi_1), $$

$$ Y_{n+1} - y_{n+1} = \frac{3h}{8}\left(Y'_{n+1} - y'_{n+1}\right) - \tfrac{19}{720}\,h^5 y^{v}(\xi_2). \tag{6.6.3} $$

In addition, we have the relation

$$ Y'_{n+1} - y'_{n+1} = F(x_{n+1}, Y_{n+1}) - F(x_{n+1}, y_{n+1}) = (Y_{n+1} - y_{n+1})\,F_y(x_{n+1}, \eta_{n+1}), $$

where $\eta_{n+1}$ is between $y_{n+1}$ and $Y_{n+1}$. If now it is assumed that $h$ is
sufficiently small to ensure that

$$ \frac{3h}{8}\left|F_y(x_{n+1}, \eta_{n+1})\right| \ll 1, \tag{6.6.4} $$

and also that $y^{v}(x)$ does not vary strongly for $x_{n-3} < x < x_{n+1}$, so that
$y^{v}(\xi_1)$ and $y^{v}(\xi_2)$ can be equated, to a first approximation, Eqs. (6.6.3)
lead to the useful approximate relation

$$ Y_{n+1} - y_{n+1} \approx \frac{-19}{251 + 19}\left(y_{n+1} - y^{(0)}_{n+1}\right) $$

or

$$ Y_{n+1} - y_{n+1} \approx -\tfrac{19}{270}\,\gamma_{n+1}, \tag{6.6.5} $$

where

$$ \gamma_{n+1} = y_{n+1} - y^{(0)}_{n+1}. \tag{6.6.6} $$


Thus, if a column of the differences $\gamma_k = y_k - y^{(0)}_k$ is carried along
in the calculation, the error in the final iterate $y_{n+1}$ which is due to
truncation error in the step from $x_n$ to $x_{n+1}$ can be estimated as
$-19\gamma_{n+1}/270 \approx -\gamma_{n+1}/14$. The reliability of this estimate depends upon
the magnitude of $hF_y$ and of $hy^{vi}$ in the relevant range (see also Prob. 24).
As will be seen in §6.9, the condition (6.6.4) is necessary in order to ensure
rapid convergence of the iteration leading from $y^{(0)}_{n+1}$ to $y_{n+1}$. Also, if the
first neglected difference $\nabla^4 y'_k$ does not vary rapidly with $k$, it can be
expected that the same is true of $y^{v}$, so that $hy^{vi}$ is probably small relative
to $y^{v}$. Accordingly, if the first neglected difference does not vary
excessively, if the rate of convergence of the iteration is satisfactory, and if
$\gamma_k$ never exceeds seven units in the last place retained, then the effect of
the truncation error in each step probably does not exceed one-half unit
in that place.

More generally, if (6.2.10) is used as a predictor for (6.3.2), the factor
$C = \tfrac{270}{19} \approx 14$ corresponding to retention of third differences is easily
seen to be replaced by 2 for no differences, 6 for first differences, 10 for
second differences, $\tfrac{270}{19} \approx 14$ for third differences, and $\tfrac{502}{27} \approx 18$ when
fourth differences are retained. That is, if the difference between the
initial prediction and the final corrected value does not exceed half the
value listed, in units of the last place retained, then the truncation error
in each step probably does not exceed one-half unit in that place.
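The predictor-corrector cycle with third differences may be sketched in the ordinate form (6.6.1), together with the truncation-error estimate (6.6.5). The function name `modified_adams`, the convergence tolerance, and the iteration cap are assumptions introduced for illustration; the four starting ordinates are taken here from the exact solution (6.4.6) rather than from one of the starting procedures of §6.4.

```python
import math

def modified_adams(f, xs, ys, h, n_steps, tol=1e-12, max_iter=50):
    """Advance with the predictor/corrector pair (6.6.1); xs, ys hold four starting points."""
    xs, ys = list(xs), list(ys)
    dys = [f(x, y) for x, y in zip(xs, ys)]
    estimates = []
    for _ in range(n_steps):
        x_new = xs[-1] + h
        # Predictor: open formula (6.2.10) with third differences.
        pred = ys[-1] + h / 24 * (55 * dys[-1] - 59 * dys[-2] + 37 * dys[-3] - 9 * dys[-4])
        y_new = pred
        # Corrector: closed formula (6.3.2), iterated until successive values agree.
        for _ in range(max_iter):
            y_next = ys[-1] + h / 24 * (9 * f(x_new, y_new) + 19 * dys[-1] - 5 * dys[-2] + dys[-3])
            converged = abs(y_next - y_new) < tol
            y_new = y_next
            if converged:
                break
        estimates.append(-19 * (y_new - pred) / 270)   # estimate (6.6.5) of Y - y
        xs.append(x_new)
        ys.append(y_new)
        dys.append(f(x_new, y_new))
    return xs, ys, estimates

f = lambda x, y: x * x - y
exact = lambda x: 2 - 2 * x + x * x - math.exp(-x)
xs0 = [0.0, 0.1, 0.2, 0.3]
xs, ys, est = modified_adams(f, xs0, [exact(x) for x in xs0], 0.1, n_steps=7)
print(ys[-1], exact(xs[-1]), est[0])
```

The list `est` plays the role of the column $-19\gamma_k/270$ carried along in a hand calculation.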

We will refer to the method just described as the modified Adams 
method (it is also known as Moulton’s method). The procedure based on 
retaining only the first difference in (6.3.2) is often called the modified 
Euler method. 

Milne's methods differ from the methods just described in that they
use (6.3.7) for iteration and (6.2.18) for prediction, retaining second
differences, or (6.3.8) for iteration and (6.2.19) for prediction, when
fourth differences are retained. The truncation error in the $n$th step
can be estimated as $-(y_{n+1} - y^{(0)}_{n+1})/29$ in the former case, and as
$-32(y_{n+1} - y^{(0)}_{n+1})/1139 \approx -(y_{n+1} - y^{(0)}_{n+1})/35$ in the latter case. These
methods possess the advantage that the truncation errors in each step
are proportional to $h^5$ and $h^7$, respectively, whereas retention of only
second or only fourth differences in the preceding method corresponds
to truncation errors proportional to $h^4$ and $h^6$, respectively. On the other
hand, as will be indicated in §6.7, they compare unfavorably with the
preceding method with regard to stability.
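A corresponding sketch of the Milne second-difference procedure follows, assuming the predictor (6.2.18) and the corrector (6.3.7) written in terms of slopes (the forms (6.6.7) and (6.6.8)), with the estimate $-(y_{n+1} - y^{(0)}_{n+1})/29$. The function name, tolerance, and iteration cap are again hypothetical, and the starting values are taken from the exact solution of the example (6.4.5).

```python
import math

def milne_second_difference(f, xs, ys, h, n_steps, tol=1e-12, max_iter=50):
    """Advance with Milne's predictor/corrector pair; xs, ys hold four starting points."""
    xs, ys = list(xs), list(ys)
    dys = [f(x, y) for x, y in zip(xs, ys)]
    estimates = []
    for _ in range(n_steps):
        x_new = xs[-1] + h
        # Predictor: y_{n+1} = y_{n-3} + (4h/3)(2y'_n - y'_{n-1} + 2y'_{n-2}).
        pred = ys[-4] + 4 * h / 3 * (2 * dys[-1] - dys[-2] + 2 * dys[-3])
        y_new = pred
        # Corrector (Simpson's rule): y_{n+1} = y_{n-1} + (h/3)(y'_{n+1} + 4y'_n + y'_{n-1}).
        for _ in range(max_iter):
            y_next = ys[-2] + h / 3 * (f(x_new, y_new) + 4 * dys[-1] + dys[-2])
            converged = abs(y_next - y_new) < tol
            y_new = y_next
            if converged:
                break
        estimates.append(-(y_new - pred) / 29)     # estimated truncation error of the step
        xs.append(x_new)
        ys.append(y_new)
        dys.append(f(x_new, y_new))
    return xs, ys, estimates

f = lambda x, y: x * x - y
exact = lambda x: 2 - 2 * x + x * x - math.exp(-x)
xs0 = [0.0, 0.1, 0.2, 0.3]
xs, ys, est = milne_second_difference(f, xs0, [exact(x) for x in xs0], 0.1, n_steps=7)
print(ys[-1], exact(xs[-1]), est[0])
```

Over so short a range the unfavorable stability of the method does not yet show.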

It is obvious that each of the formulas considered could be expressed
explicitly in terms of the values of the derivative $y'$, in place of differences
of those values, once a decision was made as to the number of differences
which were to be effectively retained. Thus, in particular, the Milne
second-difference procedure can be based on Eqs. (6.2.18) and (6.3.7) or,
equivalently, on the equations

$$ y_{n+1} = y_{n-3} + \frac{4h}{3}\left(2y'_n - y'_{n-1} + 2y'_{n-2}\right) + \frac{14h^5}{45}\,y^{v}(\xi) \tag{6.6.7} $$

and

$$ y_{n+1} = y_{n-1} + \frac{h}{3}\left(y'_{n+1} + 4y'_n + y'_{n-1}\right) - \frac{h^5}{90}\,y^{v}(\xi), \tag{6.6.8} $$

where, of course, the values of $\xi$ in the two equations are generally
unequal. The second equation is seen to be equivalent to Simpson's
rule. Similarly, the Milne fourth-difference procedure can be specified
by the equations

$$ y_{n+1} = y_{n-5} + \frac{3h}{10}\left(11y'_n - 14y'_{n-1} + 26y'_{n-2} - 14y'_{n-3} + 11y'_{n-4}\right) + \frac{41h^7}{140}\,y^{vii}(\xi) \tag{6.6.9} $$

and

$$ y_{n+1} = y_{n-3} + \frac{2h}{45}\left(7y'_{n+1} + 32y'_n + 12y'_{n-1} + 32y'_{n-2} + 7y'_{n-3}\right) - \frac{8h^7}{945}\,y^{vii}(\xi), \tag{6.6.10} $$

which are to be used for prediction and for iteration, respectively.
6.7. The Special Case F = Ay. Each of the formulas treated in the
preceding sections is expressible in the form

$$ y_{n+1} = y_{n-p} + h\left(\alpha_{-1}y'_{n+1} + \alpha_0 y'_n + \alpha_1 y'_{n-1} + \cdots + \alpha_r y'_{n-r}\right), \tag{6.7.1} $$

where $\alpha_{-1} = 0$ for the formulas of open type, and where

$$ y'_k = F(x_k, y_k). \tag{6.7.2} $$

Formula (6.7.1) corresponds to the retention of $r + 1$ differences in a
closed formula or of $r$ differences in an open formula. Thus, in
particular, it reduces to an identity if $y(x)$ is a polynomial of degree $r + 2$ or
less when $\alpha_{-1} \ne 0$, and of degree $r + 1$ or less when $\alpha_{-1} = 0$.

In the case when the differential equation is of the very special form

$$ \frac{dy}{dx} = Ay, \tag{6.7.3} $$

so that $F(x,y) = Ay$, where $A$ is a constant, the relation (6.7.1) takes the
form

$$ (1 - \alpha_{-1}Ah)\,y_{n+1} = y_{n-p} + Ah\left(\alpha_0 y_n + \alpha_1 y_{n-1} + \cdots + \alpha_r y_{n-r}\right), \tag{6.7.4} $$

and is subject to a simple analysis, the results of which are helpful in
understanding the propagation of errors in the more general case. It
may be noticed that the exact solution of (6.7.3), subject to the condition
$y(x_0) = y_0$, is

$$ y(x) = y_0 e^{A(x - x_0)}. \tag{6.7.5} $$

To fix ideas, we suppose that $r \ge p$ and so include all the commonly
used formulas, such as formulas (6.2.10) and (6.3.2), for which
$p = 0$, and the formulas (6.2.17) to (6.2.19) and (6.3.7) and (6.3.8), for
which $r = p$. As will be seen, this restriction is easily removed.

The relation (6.7.4) then affords a linear relation among the $r + 2$
ordinates $y_{n+1}, y_n, y_{n-1}, \ldots$, and $y_{n-r}$ (one of which is identical with
$y_{n-p}$) and is known as a difference equation of order $r + 1$, under the
assumption that $\alpha_{-1}Ah \ne 1$ and $\alpha_r \ne 0$. It holds only for $n \ge r$, the
ordinate $y_0$ being prescribed, and the remaining $r$ initial ordinates $y_1,
y_2, \ldots, y_r$ supposedly being supplied by an independent calculation.

We may notice that $y_n = \beta^n$ will satisfy (6.7.4), with $\beta$ constant, if $\beta$
is determined such that

$$ (1 - \alpha_{-1}Ah)\,\beta^{n+1} = \beta^{n-p} + Ah\left(\alpha_0\beta^n + \alpha_1\beta^{n-1} + \cdots + \alpha_r\beta^{n-r}\right), $$

or, after removing the common factor $\beta^{n-r}$, such that $\beta$ satisfies the
characteristic equation

$$ (1 - \alpha_{-1}Ah)\,\beta^{r+1} - Ah\left(\alpha_0\beta^r + \alpha_1\beta^{r-1} + \cdots + \alpha_r\right) - \beta^{r-p} = 0. \tag{6.7.6} $$

Since $p$ and $r$ are nonnegative integers, such that $r - p \ge 0$, this relation
is an algebraic equation of degree $r + 1$ in $\beta$ and hence possesses $r + 1$
roots $\beta_0, \beta_1, \ldots, \beta_r$, which may be real or complex.

If no roots are repeated, then, from the linearity and homogeneity of
the difference equation (6.7.4), it follows that

$$ y_n = c_0\beta_0^n + c_1\beta_1^n + \cdots + c_r\beta_r^n \tag{6.7.7} $$

satisfies (6.7.4) for arbitrary values of the $r + 1$ independent constants
$c_0, c_1, \ldots, c_r$, which are available for satisfying the $r + 1$ initial
conditions which prescribe $y_0, y_1, \ldots, y_r$. It can be shown that (6.7.7)
then represents the most general solution of (6.7.4), when $n$ is restricted
to integral values.

If $\beta_1 = \beta_2$, the terms $c_1\beta_1^n + c_2\beta_2^n$ are to be replaced by $\beta_1^n(c_1 + c_2 n)$, as
is easily verified. Furthermore, if $\beta_1$ and $\beta_2$ are conjugate complex, so
that

$$ \beta_1 = \rho e^{i\theta}, \qquad \beta_2 = \rho e^{-i\theta}, $$

where $\rho = |\beta_1| = |\beta_2|$, we may replace $c_1$ and $c_2$ by $\tfrac{1}{2}(c_1 - ic_2)$ and
$\tfrac{1}{2}(c_1 + ic_2)$, and rewrite the corresponding two terms in (6.7.7) in the more
convenient form

$$ \rho^n\left(c_1\cos n\theta + c_2\sin n\theta\right). $$


It remains to investigate the roots of the characteristic equation (6.7.6).
We may notice first that, when $h = 0$, the equation reduces to

$$ \beta^{r-p}\left(\beta^{p+1} - 1\right) = 0, $$

so that $\beta = 0$ is then a root of multiplicity $r - p$, and the remaining $p + 1$
roots are the $(p + 1)$th roots of unity. In the complex $\beta$ plane (Fig. 6.1),
$r - p$ roots coincide at the origin, whereas the remaining $p + 1$ roots are
equally spaced about the unit circle $|\beta| = 1$, with one root at the point
$\beta = 1$. When $h$ is small, the $r + 1$ roots will generally be distinct, with
$r - p$ roots near the origin, and $p + 1$ roots in the neighborhood of the
unit circle.

[Fig. 6.1: the complex $\beta$ plane, with axes Re($\beta$) and Im($\beta$).]
In particular, if we denote by $\beta_0$ that root which tends to unity as $h$
tends to zero, we may write

$$ \beta_0 = 1 + m_1 h + m_2 h^2 + \cdots, \tag{6.7.8} $$

where the coefficients $m_1, m_2, \ldots$ are to be determined in such a way
that the result of replacing $\beta$ by $\beta_0$ in (6.7.6), and expanding the result in
powers of $h$, reduces identically to zero. A simple calculation then
shows that the result of that substitution is of the form

$$ h\left[(p + 1)m_1 - A\left(\alpha_{-1} + \alpha_0 + \alpha_1 + \cdots + \alpha_r\right)\right] + \cdots = 0, $$

and hence, in particular, that we must have

$$ m_1 = \frac{A}{p + 1}\left(\alpha_{-1} + \alpha_0 + \alpha_1 + \cdots + \alpha_r\right). \tag{6.7.9} $$

But, since the integration formula which led to (6.7.1) certainly gives
exact results when applied to the integration of a constant, that is, for
$y'(x) = 1$, we deduce from (6.7.1) the relation

$$ \alpha_{-1} + \alpha_0 + \alpha_1 + \cdots + \alpha_r = p + 1, \tag{6.7.10} $$

and so find that $m_1 = A$. Thus, one root of (6.7.6) can be expressed in
the form

$$ \beta_0 = 1 + Ah + O(h^2), \tag{6.7.11} $$

where the symbol $O(h^2)$ represents a term which is small, of the order $h^2$,
when $h$ is small.


The corresponding part of the solution (6.7.7) is thus of the form

$$ c_0(1 + Ah + \cdots)^n = c_0(1 + Ah + \cdots)^{(x_n - x_0)/h} \tag{6.7.12} $$

and is approximated by $c_0 e^{A(x_n - x_0)}$ when $h$ is small. Thus we see that this
part of the general solution of the difference equation tends toward the
general solution of the approximated differential equation as $h \to 0$ and,
indeed, tends toward the required solution, for which $y(x_0) = y_0$, if
$c_0 \to y_0$ as $h \to 0$.

The remaining $r$ terms in (6.7.7) represent so-called parasitic solutions
which correspond to the fact that the order of the difference equation
exceeds the order of the approximated differential equation by $r$. For
small values of $h$, we have seen that $r - p$ of the roots $\beta_i$ will be small in
magnitude, relative to unity, and hence that the corresponding terms $\beta_i^n$
will tend rapidly to zero as the calculation proceeds and $n$ increases.

However, if $p > 0$, there are $p$ roots in addition to $\beta_0$ which are of unit
absolute value when $h = 0$. If, for $h > 0$, any one of these roots, say
$\beta_k$, has a magnitude greater than unity, then (unless the coefficient $c_k$
happens to vanish) the corresponding term $c_k\beta_k^n$ will increase unboundedly
in magnitude as $n$ increases.

In illustration, if use is made of the simplest formula of open type
(Euler's formula), with $p = r = 0$, the only root is $\beta_0 = 1 + Ah$, and hence
the solution is

$$ y_n = y_0(1 + Ah)^n = y_0(1 + Ah)^{(x_n - x_0)/h}, \tag{6.7.13} $$

which does indeed approximate (6.7.5) when $h$ is small.
For the open formula with $p = 0$ and $r = 1$,

$$ y_{n+1} = y_n + h\left(y'_n + \tfrac{1}{2}\nabla y'_n\right) = y_n + \frac{h}{2}\left(3y'_n - y'_{n-1}\right) = \left(1 + \tfrac{3}{2}Ah\right)y_n - \tfrac{1}{2}Ah\,y_{n-1}, \tag{6.7.14} $$

the characteristic equation (6.7.6) becomes

$$ \beta^2 - \left(1 + \tfrac{3}{2}Ah\right)\beta + \tfrac{1}{2}Ah = 0, $$

and yields

$$ \beta_0 = \tfrac{1}{2} + \tfrac{3}{4}Ah + \tfrac{1}{2}\sqrt{1 + Ah + \tfrac{9}{4}A^2h^2} = 1 + Ah + \tfrac{1}{2}A^2h^2 + \cdots $$

and

$$ \beta_1 = \tfrac{1}{2} + \tfrac{3}{4}Ah - \tfrac{1}{2}\sqrt{1 + Ah + \tfrac{9}{4}A^2h^2} = \tfrac{1}{2}Ah\,(1 - Ah + \cdots). $$

Thus, for small $h$, the solution is of the form

$$ y_n = c_0(1 + Ah + \cdots)^{(x_n - x_0)/h} + c_1\left(\tfrac{1}{2}Ah + \cdots\right)^{(x_n - x_0)/h}. \tag{6.7.15} $$

If $c_0$ and $c_1$ are determined such that $y_n = c_0\beta_0^n + c_1\beta_1^n$ reduces to $y_0$ and
$y_1$ for $n = 0$ and 1, respectively, there follows

$$ c_0 = \frac{y_1 - \beta_1 y_0}{\beta_0 - \beta_1}, \qquad c_1 = \frac{\beta_0 y_0 - y_1}{\beta_0 - \beta_1}. \tag{6.7.16} $$

The ordinate $y_1$ is assumed to be supplied by another method. If we
assume that $y_1$ differs from the true value $y_0 e^{Ah} = y_0(1 + Ah + \cdots)$,
at worst, by an amount of order $h$, it is easily seen that $c_0$ differs from $y_0$
and $c_1$ from zero by an amount at worst of order $h$. Hence here the
parasitic solution is small when $h$ is small, and also it tends to zero as $n \to \infty$
for any fixed value of $h$ which is sufficiently small to make $|\beta_1| < 1$.†
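The relations (6.7.14) to (6.7.16) can be checked numerically. The sketch below assumes $F = Ay$ and compares the closed form $c_0\beta_0^n + c_1\beta_1^n$ with a direct evaluation of the recursion (6.7.14); the function names are hypothetical, and `ab2_recursion` assumes $n \ge 1$.

```python
import math

def ab2_roots(A, h):
    """Roots of β² - (1 + 3Ah/2)β + Ah/2 = 0, the characteristic equation of (6.7.14)."""
    b = 1 + 1.5 * A * h
    disc = math.sqrt(1 + A * h + 2.25 * (A * h) ** 2)   # radicand is positive for all real Ah
    return (b + disc) / 2, (b - disc) / 2

def ab2_closed_form(A, h, y0, y1, n):
    """y_n = c0*β0**n + c1*β1**n, with c0 and c1 taken from (6.7.16)."""
    b0, b1 = ab2_roots(A, h)
    c0 = (y1 - b1 * y0) / (b0 - b1)
    c1 = (b0 * y0 - y1) / (b0 - b1)
    return c0 * b0 ** n + c1 * b1 ** n

def ab2_recursion(A, h, y0, y1, n):
    """y_n (n >= 1) obtained by stepping (6.7.14) directly."""
    prev, cur = y0, y1
    for _ in range(n - 1):
        prev, cur = cur, (1 + 1.5 * A * h) * cur - 0.5 * A * h * prev
    return cur

A, h = -1.0, 0.1
print(ab2_roots(A, h))
print(ab2_closed_form(A, h, 1.0, math.exp(A * h), 10), ab2_recursion(A, h, 1.0, math.exp(A * h), 10))
```

For $A = -1$ and $h = 0.1$ the roots are close to the leading terms $1 + Ah + \tfrac{1}{2}A^2h^2 = 0.905$ and $\tfrac{1}{2}Ah(1 - Ah) = -0.055$ of the expansions above.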

As an example in which p > 0, we may notice that if the Simpson’s 
rule formula (6.6.8) is used, in the form 


Co = 


You = Yn + 4h (Ynti + 4Yn + Yn—1)s (6.7.17) 


Eq. (6.7.6) becomes 


Ah 4Ah , _ 4) - 
(.- 4) - the (14-44) =9, 


with roots expressible in the forms 
Go=1+Ah+---, Bi= —1+ 3Ah+°::, 
when h is small. Thus the solution of (6.7.17) is expressible in the form 


Yn = co(d + Ah+-:-: +) @n—z0)/h + (—1)ei(1 es gAh +.:: -) @n—z0)/h 
re coeAee—=) 4 (— 1) "ce 14), (6.7.18) 


when h is small. 
When $A$ is positive, so that the exact solution grows exponentially with $x$, the root $\beta_1$ lies inside the circle $|\beta| = 1$, and the parasitic solution accordingly damps out exponentially in magnitude as the calculation proceeds with increasing $n$. However, when $A$ is negative, so that the exact solution tends exponentially to zero as $x$ increases, the parasitic solution increases exponentially in magnitude and, in addition, alternates in sign from step to step in an advancing calculation. To a first approximation, $c_1$ is found to be half the difference between the value of $y_1$ used in the calculation and the true value $y_0e^{Ah}$. However, the value which should be assigned to $y_1$ in order that $c_1$ vanish exactly is not the true value, but a value which tends to the true value as $h \to 0$. That is, the parasitic solution would be present even though $y_0$ and $y_1$ were exactly correct.

† If $A \geq 0$, the requirement $|\beta_1| < 1$ is satisfied for all $h$. However, if $A < 0$, the spacing $h$ must be such that $h < 1/|A|$.
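The growth of the parasitic solution for negative $A$ can be observed directly. The sketch below is an illustration under stated assumptions: it applies the Simpson's rule recurrence to $y' = Ay$ with exact starting values $y_0 = 1$, $y_1 = e^{Ah}$, and solves the (linear) closed formula for $y_{n+1}$ in closed form instead of iterating. The names `simpson_roots` and `simpson_recurrence` and the values $A = -2$, $h = 0.1$ are hypothetical choices.

```python
import math


def simpson_roots(A, h):
    """Roots of (1 - Ah/3) b^2 - (4Ah/3) b - (1 + Ah/3) = 0 (requires Ah < 3)."""
    t = A * h / 3.0
    disc = math.sqrt(4.0 + 12.0 * t * t)
    beta0 = (4.0 * t + disc) / (2.0 * (1.0 - t))  # close to exp(A*h)
    beta1 = (4.0 * t - disc) / (2.0 * (1.0 - t))  # close to -exp(-A*h/3)
    return beta0, beta1


def simpson_recurrence(A, h, n_steps):
    """Advance y' = A*y by y_{n+1} = y_{n-1} + (h/3)(y'_{n+1} + 4y'_n + y'_{n-1}),
    starting from the exact values y_0 = 1 and y_1 = exp(A*h)."""
    t = A * h / 3.0
    y = [1.0, math.exp(A * h)]
    for n in range(1, n_steps):
        y.append(((1.0 + t) * y[n - 1] + 4.0 * t * y[n]) / (1.0 - t))
    return y


# Illustrative run (assumed values): the true solution exp(-2x) is negligible
# at x = 20, yet the computed values are large and alternate in sign.
y = simpson_recurrence(-2.0, 0.1, 200)
print(y[199], y[200])
print(simpson_roots(-2.0, 0.1))
```

Even with exact starting values the parasitic component is present, as the text states, and here it eventually swamps the decaying true solution.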

It is important to notice also that each round-off committed at any 
stage of the advancing calculation will initiate a new parasitic solution, 
of the same type. 

In the somewhat more general case when the differential equation is of the form $y' = Ay + Bx + C$, where $A$, $B$, and $C$ are constants, a linear function of $x$ is accordingly added to the exponential term present in the true solution when $B = C = 0$. It is found that the same modification occurs in the solution of the approximating difference equation, so that a linear function of $n$ is merely added to the right-hand member of (6.7.7). Thus the same parasitic solutions are present, and the preceding discussion again applies, except for the fact that here, when $A$ is negative, the true solution will not decrease in magnitude as $x$ increases, but will grow linearly, while the parasitic solutions may grow exponentially.

Finally, in the general case, when we are concerned with an equation of the form $y' = F(x,y)$, we may imagine that $F(x,y)$ is replaced by the linear approximation

$$F(x,y) \approx F(x_n,y_n) + (x - x_n)F_x(x_n,y_n) + (y - y_n)F_y(x_n,y_n)$$

in the neighborhood of a point $(x_n,y_n)$, and so imagine that the differential equation is replaced by the linear equation $y' = A_ny + B_nx + C_n$, where

$$A_n = F_y(x_n,y_n), \qquad B_n = F_x(x_n,y_n), \qquad C_n = F(x_n,y_n) - x_nF_x(x_n,y_n) - y_nF_y(x_n,y_n).$$

It is then plausible (but not always true) that the nature of the error propagated in the numerical solution of the true equation will be simulated by that for the linearized equation, over a short range near $x_n$. The situations in which no one of the parasitic solutions tends to increase in magnitude as the calculation proceeds from that point are often said to be characterized by short-range stability.

In order to illustrate the occurrence of instability, we present in Table 6.2 the results of calculations based on the problem

$$y' + 2y = 2, \qquad y(0) = 2.$$

TABLE 6.2

$x$ | $y_M$ | $\gamma_M$ | $y_A$ | $\gamma_A$ | $y_T$
0.0 | 2.000 | — | 2.000 | — | 2.000
0.5 | 1.368 | — | 1.368 | — | 1.368
1.0 | 1.132 | — | 1.135 | — | 1.135
1.5 | 1.052 | — | 1.046 | — | 1.050
2.0 | 1.014 | −42 | 1.016 | −64 | 1.018
2.5 | 1.012 | −36 | 1.005 | −17 | 1.007
3.0 | 0.995 | 15 | 1.002 | −10 | 1.002
3.5 | 1.011 | −33 | 1.001 | −2 | 1.001
4.0 | 0.986 | 40 | 1.000 | −2 | 1.000
4.5 | 1.020 | −57 | 1.000 | −1 | 1.000








The entries in the column headed $y_M$ are values determined by the Milne method, using (6.6.7) for prediction and (6.6.8) (Simpson's rule) for actual calculation. The entries $\gamma_M$ represent the differences between the final results and the initial predictions, in units of the last place retained, and $-\gamma_M/29$ affords the Milne estimate of the truncation error in each step. The entries in the column headed $y_A$ were obtained by the modified Adams method, using (6.2.10) with third differences as a predictor and (6.3.2) with third differences for actual calculation. The estimated truncation error in each step is afforded by $-\gamma_A/14$. The entries in the $y_T$ column are values of the true solution $y(x) = e^{-2x} + 1$, rounded to three decimal places. As has been noted, both numerical procedures introduce truncation errors of order $h^5$ in each step.

A large spacing is chosen deliberately, and all calculations are rounded 
to three decimal places, in order to cause the effects of the error propaga- 
tion to become evident at a relatively early stage of the process. The 
requisite starting values (above the broken lines) are correct to the places 
retained. 

The tabulation is intended, not only to show the increasing oscillation of the first solution about the true solution, but also to serve as a reminder that the quantity $-\gamma_M/29$ affords only an estimate of the truncation error introduced in each step but does not, in itself, indicate the manner in which the effects of that error are propagated. Thus, for example, the fact that $-\gamma_M/29$ is smaller than 2 in each step must not be interpreted as indicating that the accumulated error at each step is less than 2 units in the last place. In fact, that error is seen to amount to −20 units in $y_M(4.5)$, in the present case.
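The Milne column of Table 6.2 can be retraced with a few lines of code. The sketch below is an illustration under stated assumptions: it takes the four tabulated starting entries of the $y_M$ column, uses the predictor $y_{n+1} = y_{n-3} + \tfrac{4h}{3}(2y'_n - y'_{n-1} + 2y'_{n-2})$ and the Simpson corrector, and solves the corrector for $y_{n+1}$ in closed form (possible because $F = 2 - 2y$ is linear) instead of iterating. No intermediate rounding is applied, so later entries may differ from the table by about a unit in the last place. The name `milne_table` is hypothetical.

```python
def milne_table(h=0.5, n_rows=10):
    """Milne predictor-corrector for y' = 2 - 2y, started from the tabulated
    y_M entries at x = 0, 0.5, 1.0, 1.5. Returns (y values, gamma values)."""
    f = lambda y: 2.0 - 2.0 * y
    y = [2.000, 1.368, 1.132, 1.052]
    gamma = [None, None, None, None]  # no prediction exists for starting rows
    while len(y) < n_rows:
        n = len(y) - 1
        pred = y[n - 3] + (4.0 * h / 3.0) * (
            2.0 * f(y[n]) - f(y[n - 1]) + 2.0 * f(y[n - 2]))
        # Corrector: y_{n+1} = y_{n-1} + (h/3)(f(y_{n+1}) + 4 f(y_n) + f(y_{n-1})),
        # which is linear in y_{n+1} and is solved exactly here.
        rhs = y[n - 1] + (h / 3.0) * (2.0 + 4.0 * f(y[n]) + f(y[n - 1]))
        corr = rhs / (1.0 + 2.0 * h / 3.0)
        y.append(corr)
        gamma.append(corr - pred)  # final value minus initial prediction
    return y, gamma


y_m, gamma_m = milne_table()
for i, (yv, gv) in enumerate(zip(y_m, gamma_m)):
    print(0.5 * i, round(yv, 4), None if gv is None else round(1000 * gv, 1))
```

The printed values oscillate about the true solution with growing amplitude, while each $\gamma_M/29$ stays below 2 units, which is the point made in the text.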

6.8. Propagated-error Bounds. In actual calculation, the calculated value of $y_{n+1}$ generally will not be given exactly by the right-hand member of the relevant formula (6.7.1), because of the necessity of effecting round-offs. If we replace $y_{n+1}$ by $y_{n+1} + R_n$, where $R_n$ is inserted to account for the effects of round-off in the $n$th step, Eqs. (6.7.1) and (6.7.2) give

$$y_{n+1} = y_{n-p} + h\sum_{k=-1}^{r}\alpha_kF(x_{n-k},y_{n-k}) - R_n. \tag{6.8.1}$$

On the other hand, if we denote the true value of the solution of the given problem when $x = x_i$ by $Y_i$, we have also the relation

$$Y_{n+1} = Y_{n-p} + h\sum_{k=-1}^{r}\alpha_kF(x_{n-k},Y_{n-k}) + T_n, \tag{6.8.2}$$

where we here denote the truncation error corresponding to the $n$th step by $T_n$.

If we subtract (6.8.1) from (6.8.2) and write

$$\epsilon_n = Y_n - y_n, \qquad E_n = T_n + R_n, \tag{6.8.3}$$

there follows

$$\epsilon_{n+1} = \epsilon_{n-p} + h\sum_{k=-1}^{r}\alpha_k\left[F(x_{n-k},Y_{n-k}) - F(x_{n-k},y_{n-k})\right] + E_n. \tag{6.8.4}$$


In order to obtain a bound on the magnitude of $\epsilon_n$, we notice first that we have

$$F(x_i,Y_i) - F(x_i,y_i) = (Y_i - y_i)F_y(x_i,\eta_i) = \epsilon_iF_y(x_i,\eta_i), \tag{6.8.5}$$

if $F_y \equiv \partial F/\partial y$ is continuous, where $\eta_i$ is between $y_i$ and $Y_i$. Hence (6.8.4) can be written in the form

$$\left[1 - h\alpha_{-1}F_y(x_{n+1},\eta_{n+1})\right]\epsilon_{n+1} = \epsilon_{n-p} + h\sum_{k=0}^{r}\alpha_k\epsilon_{n-k}F_y(x_{n-k},\eta_{n-k}) + E_n. \tag{6.8.6}$$

Suppose now that, for the range of values of $x$ and $y$ involved in the over-all calculation, we have

$$|F_y(x,y)| \leq K, \tag{6.8.7}$$

where $K$ is a known constant, and consider the related difference equation

$$\left(1 - Kh|\alpha_{-1}|\right)e_{n+1} = e_{n-p} + Kh\sum_{k=0}^{r}|\alpha_k|e_{n-k} + E, \tag{6.8.8}$$

where $E$ is such that

$$|E_n| \leq E \qquad (n = r, r + 1, \ldots). \tag{6.8.9}$$
If $\alpha_{-1} \neq 0$, so that the formula is of closed type, suppose also that $h$ is sufficiently small to ensure that

$$Kh|\alpha_{-1}| < 1. \tag{6.8.10}$$

From (6.8.6) and (6.8.7), we have

$$\left|1 - h\alpha_{-1}F_y(x_{n+1},\eta_{n+1})\right||\epsilon_{n+1}| \leq |\epsilon_{n-p}| + hK\sum_{k=0}^{r}|\alpha_k||\epsilon_{n-k}| + |E_n|,$$

and hence, if $|\epsilon_n| \leq e_n$, $|\epsilon_{n-1}| \leq e_{n-1}$, . . . , $|\epsilon_{n-r}| \leq e_{n-r}$, there follows also

$$\left|1 - h\alpha_{-1}F_y(x_{n+1},\eta_{n+1})\right||\epsilon_{n+1}| \leq \left(1 - Kh|\alpha_{-1}|\right)e_{n+1},$$

and thus $|\epsilon_{n+1}| \leq e_{n+1}$. That is, if $|\epsilon_i| \leq e_i$ for $r + 1$ successive integral values of $i$, then, by induction, the same is true for all succeeding integral values of $i$.

The error $\epsilon_0$ vanishes except for round-off since $y_0 = Y_0$ is prescribed. Also, again assuming that $r \geq p$, the errors $\epsilon_1, \ldots, \epsilon_r$ are errors associated with the starting values $y_1, \ldots, y_r$, supplied by independent analysis. Let $\bar e$ be a positive number which is not exceeded in magnitude by any of these initial errors. Then, if $e_n$ is a solution of (6.8.8) which is not smaller than $\bar e$ for $n = 0, 1, \ldots, r$, it follows that

$$|\epsilon_n| \leq e_n$$

for all relevant values of $n$. That is, any such solution of (6.8.8) will "dominate" the solution of (6.8.4).
Since the nonhomogeneous term $E$ in (6.8.8) is a constant, a particular solution of (6.8.8) may be assumed in the form $e_n = -\lambda$, where $\lambda$ is a constant, and the introduction of this assumption leads to the determination

$$\lambda = \frac{E}{Kh\sigma}, \tag{6.8.11}$$

with the additional abbreviation

$$\sigma = \sum_{k=-1}^{r}|\alpha_k|. \tag{6.8.12}$$

It may be noticed that $\sigma = p + 1$ when all the $\alpha$'s are positive, and that $\sigma \geq p + 1$ in any case, in virtue of (6.7.10).

To this particular solution may be added any multiple of $\beta^n$, where $\beta$ is determined such that $\beta^n$ satisfies the homogeneous difference equation obtained by replacing $E$ by zero in (6.8.8), and where $\beta$ accordingly must satisfy the characteristic equation

$$\left(1 - Kh|\alpha_{-1}|\right)\beta^{r+1} - Kh\left(|\alpha_0|\beta^r + |\alpha_1|\beta^{r-1} + \cdots + |\alpha_{r-1}|\beta + |\alpha_r|\right) - \beta^{r-p} = 0. \tag{6.8.13}$$
Since the left-hand member is negative when $\beta = 1$ and tends to $+\infty$ as $\beta \to +\infty$, there is a positive real root $\beta_0$ which is larger than unity. Indeed, for small values of $h$, it is found to be expressible in the form

$$\beta_0 = 1 + \frac{Kh\sigma}{p + 1} + O(h^2). \tag{6.8.14}$$

With this value of $\beta$, and the value of $\lambda$ defined by (6.8.11), it follows that $e_n = c\beta_0^n - \lambda$ satisfies (6.8.8) for any constant value of $c$. In addition, since it increases steadily with $n$, if we determine $c$ in such a way that $e_0 = \bar e$, so that

$$e_n = \bar e\beta_0^n + \lambda\left(\beta_0^n - 1\right), \tag{6.8.15}$$

then we will have $e_n \geq \bar e$ for all $n \geq 0$. Thus this particular solution of (6.8.8) will dominate the solution of (6.8.4).

Hence, in summary, we deduce that the error $\epsilon_n$ associated with the value of $y_n$ determined by step-by-step calculation based on the formula

$$y_{n+1} = y_{n-p} + h\sum_{k=-1}^{r}\alpha_kF(x_{n-k},y_{n-k}) \qquad (n \geq r) \tag{6.8.16}$$

is limited by the inequality

$$|\epsilon_n| \leq \bar e\beta_0^n + \lambda\left(\beta_0^n - 1\right), \tag{6.8.17}$$

where $\bar e$ is the absolute value of the largest error associated with the $r + 1$ starting values $y_0, y_1, \ldots, y_r$; $\lambda$ is defined by the equations

$$\lambda = \frac{E}{Kh\sigma}, \qquad \sigma = \sum_{k=-1}^{r}|\alpha_k|; \tag{6.8.18}$$

$K$ is the maximum value of $|\partial F/\partial y|$ for the range of values of $x$ and $y$ involved in the calculation; $E$ is the absolute value of the maximum total error introduced in each step; and $\beta_0$ is the positive real root of the equation

$$\beta^{r+1} = \beta^{r-p} + Kh\sum_{k=-1}^{r}|\alpha_k|\beta^{r-k}, \tag{6.8.19}$$

which exceeds unity.†

In those cases when the coefficients $\alpha_k$ are all positive, reference to (6.7.10) shows that (6.8.11) reduces to $\lambda = E/[Kh(p + 1)]$ and that the expansion (6.8.14) becomes $\beta_0 = 1 + Kh + O(h^2)$. Also, the characteristic equation (6.8.13) is then equivalent to Eq. (6.7.6), with $A$ replaced by $K$, and $\beta_0^n \approx \exp[K(x_n - x_0)]$.

† If $\partial F/\partial y$ is known to be negative throughout the calculation, a less conservative bound often can be obtained in a correspondingly simple form. For example, see Probs. 19 and 20.
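The bound (6.8.17) is straightforward to evaluate once $\beta_0$ has been located. The following is a minimal sketch, assuming the coefficients are supplied as a dictionary mapping $k$ to $\alpha_k$ and that $\beta_0$ is found by bisection on (6.8.19); the function name `propagated_error_bound` and the sample data (the open formula with $p = 0$, $r = 1$ of §6.7, with $K = 1$, $h = 0.01$, $E = 10^{-6}$, $\bar e = 0$) are illustrative assumptions.

```python
def propagated_error_bound(alpha, p, K, h, E, e_bar, n):
    """Evaluate the right-hand member of (6.8.17).

    alpha : dict {k: alpha_k}, k running from -1 (closed type) or 0 up to r
    Returns (beta0, bound)."""
    r = max(alpha)
    sigma = sum(abs(a) for a in alpha.values())
    if K * h * abs(alpha.get(-1, 0.0)) >= 1.0:
        raise ValueError("spacing too large: need K*h*|alpha_{-1}| < 1, cf. (6.8.10)")

    def f(beta):
        # beta^{r+1} - beta^{r-p} - K h sum |alpha_k| beta^{r-k}, from (6.8.19)
        s = sum(abs(a) * beta ** (r - k) for k, a in alpha.items())
        return beta ** (r + 1) - beta ** (r - p) - K * h * s

    lo, hi = 1.0, 2.0          # f(1) = -K*h*sigma < 0
    while f(hi) <= 0.0:        # expand until the sign changes
        hi *= 2.0
    for _ in range(200):       # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta0 = 0.5 * (lo + hi)
    lam = E / (K * h * sigma)
    return beta0, e_bar * beta0 ** n + lam * (beta0 ** n - 1.0)


# Illustrative data (assumed): y_{n+1} = y_n + (h/2)(3y'_n - y'_{n-1}).
beta0_demo, bound_demo = propagated_error_bound(
    {0: 1.5, 1: -0.5}, p=0, K=1.0, h=0.01, E=1e-6, e_bar=0.0, n=100)
print(beta0_demo, bound_demo)
```

For this formula $\sigma = 2$, so the root lies close to $1 + Kh\sigma/(p+1) = 1.02$, in agreement with (6.8.14).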
Of the three constants $\bar e$, $E$, and $K$, needed for the application of this error estimate, the first may be estimated initially, and it represents the maximum round-off error associated with the initial values determined before the stepwise calculation is begun, if those values are correctly determined to the number of places retained.

The constant $E$ comprises the maximum error introduced in one step because of round-off and truncation. The latter effect cannot be estimated in advance unless $F(x,y)$ is of a particularly convenient form, but it can be estimated in the course of the calculation by approximating the factor $h^my^{(m)}(\xi)$ by $h\nabla^{m-1}y'$ in the truncation-error term, or by making use of the quantity $\gamma_n$ defined in §6.6 if one of the methods described in that section is employed.

The constant $K$ can be calculated in advance if the equation is linear, since then $\partial F/\partial y$ is independent of $y$, and it can be estimated in advance (assuming that an analytical expression for $\partial F/\partial y$, in terms of $x$ and $y$, can be obtained) if the range of values of $y$ can be estimated initially. Otherwise, sample values of $\partial F/\partial y$ can be tabulated as the calculation proceeds.† Thus, for example, in the case of (6.4.5) we have $\partial F/\partial y = -1$, and hence $K = 1$. For the equation $y' = x^2 + y^2$, $K$ would be estimated as the largest value of $2|y|$ encountered in the calculation.

The maximum effects of errors due to truncation and to round-off can 
be treated separately. However, because of the more or less random 
fluctuation in sign of errors due to round-off, any upper bound on the 
over-all effect of a large number of round-offs, no matter how precise, is
likely to be extremely conservative in any actual calculation. On the 
other hand, the statistical analysis of such effects in stepwise integration 
is rather involved and, in any case, can afford only the probability that 
the over-all effect of round-off errors will not exceed a certain amount. 

† It should be noticed that the estimate $\partial F/\partial y \approx \Delta y'/\Delta y$ (which has been suggested in the literature) is not generally significant. Whereas we do have the relation

$$\Delta y'_n = \left[F(x_{n+1},y_{n+1}) - F(x_{n+1},y_n)\right] + \left[F(x_{n+1},y_n) - F(x_n,y_n)\right] \approx F_y(x_{n+1},y_n)\,\Delta y_n + hF_x(x_n,y_n),$$

there is no reason to suppose that the last term is small relative to $\Delta y'_n$.

6.9. Convergence of Iterations. When the formula used is of the closed type, its solution generally must be approximated by iteration, as was seen in §6.6. It is then important to determine conditions under which this iteration converges and to investigate the rate of convergence.

If we denote the $i$th approximation to $y_{n+1}$ by $y^{(i)}_{n+1}$, the iteration is described by the equation

$$y^{(i+1)}_{n+1} = h\alpha_{-1}F\left(x_{n+1},y^{(i)}_{n+1}\right) + y_{n-p} + h\sum_{k=0}^{r}\alpha_kF(x_{n-k},y_{n-k}), \tag{6.9.1}$$

whereas $y_{n+1}$ itself satisfies the equation

$$y_{n+1} = h\alpha_{-1}F(x_{n+1},y_{n+1}) + y_{n-p} + h\sum_{k=0}^{r}\alpha_kF(x_{n-k},y_{n-k}). \tag{6.9.2}$$

When we subtract (6.9.1) from (6.9.2), and suppress the subscript $n + 1$ in order to simplify the notation, we obtain the relation

$$y - y^{(i+1)} = h\alpha_{-1}\left(y - y^{(i)}\right)F_y\left(x,\eta^{(i)}\right), \tag{6.9.3}$$

where $\eta^{(i)}$ lies between $y^{(i)}_{n+1}$ and $y_{n+1}$. If, in the neighborhood of $(x_{n+1},y_{n+1})$, we have $|F_y(x,y)| \leq K_{n+1}$, there follows

$$\left|y - y^{(i+1)}\right| \leq h|\alpha_{-1}|K_{n+1}\left|y - y^{(i)}\right|, \tag{6.9.4}$$

and also, by induction,

$$\left|y - y^{(i)}\right| \leq \left(h|\alpha_{-1}|K_{n+1}\right)^i\left|y - y^{(0)}\right|. \tag{6.9.5}$$

Hence, if $h$ is sufficiently small to ensure that

$$h < \frac{1}{|\alpha_{-1}|K_{n+1}}, \tag{6.9.6}$$

in accordance with (6.8.10), the error in the $i$th iterate tends to zero as $i$ increases.

The rate of convergence is specified by the ratio of the magnitudes of errors of successive iterates, and it is seen that this ratio is approximated by the absolute value of the "convergence factor" $\rho_{n+1}$ such that

$$\rho_{n+1} = h\alpha_{-1}F_y(x_{n+1},y_{n+1}). \tag{6.9.7}$$

For example, if Milne's method, based on (6.3.7) or (6.6.8), were to be used in dealing with a problem of the form

$$y' = g(x) - y^2, \qquad y(0) = 1,$$

we would notice that, near the beginning of the calculation,

$$F_y = -2y \approx -2.$$

Since here $\alpha_{-1} = \tfrac13$, from (6.6.8), the convergence factor in the early steps would be about $-2h/3$. Thus, with the choice $h = 0.1$, each iterate would tend to deviate from the limiting value by about one-fifteenth of the deviation associated with the preceding iterate.
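The convergence factor can be observed by carrying out the iteration (6.9.1) for a single step. The sketch below is only an illustration: the right-hand member $F(x,y) = -y^2$ is a hypothetical choice with $F_y = -2y$ (it is not the example of the text), the two known ordinates are taken from its exact solution $y = 1/(1+x)$, and the name `corrector_iterates` is assumed.

```python
def corrector_iterates(F, x_next, known, h, alpha_m1, y_start, n_iter):
    """Iterate y <- h*alpha_{-1}*F(x_next, y) + known, as in (6.9.1).
    `known` collects y_{n-p} + h * sum_{k>=0} alpha_k F(x_{n-k}, y_{n-k})."""
    ys = [y_start]
    for _ in range(n_iter):
        ys.append(h * alpha_m1 * F(x_next, ys[-1]) + known)
    return ys


# Hypothetical problem: y' = -y^2, y(0) = 1, exact solution 1/(1 + x).
F = lambda x, y: -y * y
h = 0.1
y0, y1 = 1.0, 1.0 / 1.1
# Simpson's rule (6.6.8): y_2 = y_0 + (h/3)(F_2 + 4 F_1 + F_0), alpha_{-1} = 1/3.
known = y0 + (h / 3.0) * (4.0 * F(h, y1) + F(0.0, y0))
ys = corrector_iterates(F, 2 * h, known, h, 1.0 / 3.0, y1, 40)

y_limit = ys[-1]                                   # converged corrector value
rho_observed = (ys[2] - y_limit) / (ys[1] - y_limit)
rho_predicted = h * (1.0 / 3.0) * (-2.0 * y_limit)  # h * alpha_{-1} * F_y
print(rho_observed, rho_predicted)
```

The observed ratio of successive deviations agrees closely with $h\alpha_{-1}F_y$, which is the content of (6.9.7).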

It should be carefully noticed that the smallness of the difference between successive iterates at a certain stage is a measure only of the degree of approximation to which the difference equation employed for the approximate integration is satisfied at that stage, and clearly must not be considered in any sense as a measure of the difference between an iterate and the true solution of the differential equation. Indeed, there obviously is no certainty that the ultimate result of the iteration, as applied, say, to (6.6.8), will be a better approximation to the true solution of the differential equation than the initial prediction afforded, say, by (6.6.7), although the remainder terms indicate that the odds are about 28 to 1 in its favor.

6.10. Application to Equations of Higher Order. In order to apply 
one of the preceding methods to a differential equation of higher order, 
it is often convenient first to replace that equation by an equivalent set 
of equations of the first order. We here illustrate the procedure only in 
the case of a second-order equation, after which the generalization to 
higher-order equations, or to sets of simultaneous equations of more gen- 
eral type, will be obvious. 

The problem

$$y'' = G(x,y,y'), \qquad y(x_0) = y_0, \qquad y'(x_0) = y'_0 \tag{6.10.1}$$

is equivalent to the problem

$$\begin{aligned} y' &= u, & y(x_0) &= y_0, \\ u' &= G(x,y,u), & u(x_0) &= y'_0, \end{aligned} \tag{6.10.2}$$

which is, in turn, a specialization of the more general problem in which $u$ is replaced by, say, $F(x,y,u)$ in the right-hand member of the first equation.

It is usually convenient, but not necessary, to use the same formula in dealing with the two equations in (6.10.2). The approximate formulation then comprises two relations which are expressible in the general form

$$y_{n+1} = y_{n-p} + h\left(\alpha_{-1}y'_{n+1} + \alpha_0y'_n + \alpha_1y'_{n-1} + \cdots + \alpha_ry'_{n-r}\right) \tag{6.10.3}$$

and

$$u_{n+1} = u_{n-p} + h\left(\alpha_{-1}u'_{n+1} + \alpha_0u'_n + \alpha_1u'_{n-1} + \cdots + \alpha_ru'_{n-r}\right), \tag{6.10.4}$$

or in equivalent forms in terms of backward differences, with

$$y'_n = u_n \tag{6.10.5}$$

and

$$u'_n = G(x_n,y_n,u_n). \tag{6.10.6}$$

In any case, the relations (6.10.3) and (6.10.4) apply only for $n \geq r$, the values $y_0$ and $u_0$ being given and the values $y_1, \ldots, y_r$ and $u_1, \ldots, u_r$ being obtained by another method (such as the use of power series or of one of the methods to be given in §6.16). The values of $u'_0, u'_1, \ldots, u'_r$ are calculated in advance, from (6.10.6). If the formula is of open type, so that $\alpha_{-1} = 0$, $y_{r+1}$ and $u_{r+1}$ are then calculated directly by use of (6.10.3) and (6.10.4). Next $y'_{r+1}$ is given immediately as $u_{r+1}$, and $u'_{r+1}$ is calculated as $G(x_{r+1},y_{r+1},u_{r+1})$, so that data are then available for advancing by another step.
If the formula is of closed type, so that $\alpha_{-1} \neq 0$, an initial prediction $u^{(0)}_{r+1}$ is first obtained (by use of a supplementary formula of open type, by pure estimation, or otherwise) and $y^{(0)}_{r+1}$ is obtained by replacing $y'_{r+1}$ by $u^{(0)}_{r+1}$ in (6.10.3). Next $u'^{(0)}_{r+1}$ is obtained from (6.10.6), with $y$ and $u$ replaced by their zeroth approximations, and the cycle is closed by calculating $u^{(1)}_{r+1}$ from (6.10.4). If the calculated value differs from the initial prediction $u^{(0)}_{r+1}$, the cycle is iterated until agreement is obtained when the iteration converges. The next step is then taken in the same way.


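The iteration is thus described by the equations

$$\begin{aligned} y^{(i)}_{n+1} &= y_{n-p} + h\left(\alpha_{-1}u^{(i)}_{n+1} + \cdots\right), \\ u'^{(i)}_{n+1} &= G\left(x_{n+1},y^{(i)}_{n+1},u^{(i)}_{n+1}\right), \\ u^{(i+1)}_{n+1} &= u_{n-p} + h\left(\alpha_{-1}u'^{(i)}_{n+1} + \cdots\right), \end{aligned} \tag{6.10.7}$$

where $y_{n-p}$, $u_{n-p}$, and all omitted terms remain fixed throughout the iteration. There then follows also

$$y - y^{(i)} = h\alpha_{-1}\left(u - u^{(i)}\right) \tag{6.10.8}$$

and

$$u - u^{(i+1)} = h\alpha_{-1}\left[G(x,y,u) - G\left(x,y^{(i)},u^{(i)}\right)\right], \tag{6.10.9}$$

where the common subscript $n + 1$ is suppressed throughout. Now if, near $(x_{n+1},y_{n+1},u_{n+1})$, it is true that

$$|G_y(x,y,u)| \leq K_{n+1}, \qquad |G_u(x,y,u)| \leq L_{n+1}, \tag{6.10.10}$$

then we may deduce from (6.10.9) the inequality

$$\left|u - u^{(i+1)}\right| \leq h|\alpha_{-1}|\left[K_{n+1}\left|y - y^{(i)}\right| + L_{n+1}\left|u - u^{(i)}\right|\right], \tag{6.10.11}$$

and hence, making use of (6.10.8),

$$\left|u - u^{(i+1)}\right| \leq h|\alpha_{-1}|\left(h|\alpha_{-1}|K_{n+1} + L_{n+1}\right)\left|u - u^{(i)}\right|. \tag{6.10.12}$$

Thus, convergence will attain if $h$ is so chosen that

$$h|\alpha_{-1}|\left(h|\alpha_{-1}|K_{n+1} + L_{n+1}\right) < 1, \tag{6.10.13}$$

and the "convergence factor" $\rho_n$ in the $n$th step is such that

$$|\rho_n| \leq h|\alpha_{-1}|\left(h|\alpha_{-1}|K + L\right), \tag{6.10.14}$$

where $K$ and $L$ are upper bounds on $|G_y|$ and $|G_u|$. If $G$ does not explicitly involve $u = y'$, it is seen that the convergence factor is of second order in $h$.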


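In order to illustrate the procedures, we consider the simple problem

$$y'' = y + xy', \qquad y(0) = 1, \qquad y'(0) = 0. \tag{6.10.15}$$

In order to start the calculation, we first obtain the expressions

$$y'' = y + xy', \qquad y''' = 2y' + xy'', \qquad y^{iv} = 3y'' + xy''', \qquad \ldots,$$

and hence, with $x_0 = 0$,

$$y_0 = 1, \quad y'_0 = 0, \quad y''_0 = 1, \quad y'''_0 = 0, \quad y^{iv}_0 = 3, \quad y^{v}_0 = 0, \quad y^{vi}_0 = 15, \quad \ldots,$$

so that, with $h = 0.1$, there follows

$$y_k = 1 + \frac12\left(\frac{k}{10}\right)^2 + \frac18\left(\frac{k}{10}\right)^4 + \frac{1}{48}\left(\frac{k}{10}\right)^6 + \cdots,$$

$$y'_k = \frac{k}{10} + \frac12\left(\frac{k}{10}\right)^3 + \frac18\left(\frac{k}{10}\right)^5 + \cdots.$$

Thus we may obtain, in particular,

$$\begin{aligned} y_1 &= 1.0050, & y_2 &= 1.0202, & y_3 &= 1.0460, & \ldots, \\ y'_1 &= 0.1005, & y'_2 &= 0.2040, & y'_3 &= 0.3138, & \ldots, \end{aligned} \tag{6.10.16}$$

if only four places are retained. For the purpose of simplicity, we make use of the calculated values of $y_1$ and $y'_1$, and proceed by using a formula involving only first differences.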

The preliminary calculation may then be arranged as follows:

$x$ | $y$ | $y' = u$ | $\nabla y'$ | $\nabla^2y'$ | $y'' = y + xu$ | $\nabla y''$ | $\nabla^2y''$
0.0 | 1.0000 | 0.0000 | — | — | 1.0000 | — | —
0.1 | 1.0050 | 0.1005 | 1005 | — | 1.0150 | 150 | —

If the Adams formula of the open type is used with first differences, there follows

$$y_2 \approx 1.0050 + 0.1\left[0.1005 + \tfrac12(0.1005)\right] \approx 1.0201,$$

$$u_2 \approx 0.1005 + 0.1\left[1.0150 + \tfrac12(0.0150)\right] \approx 0.2028,$$

and the third line of the calculation is

0.2 | 1.0201 | 0.2028 | 1023 | 18 | 1.0607 | 457 | 307    (6.10.17)


Since a second difference $\nabla^2y''$ of about 300 would contribute about $\tfrac{1}{10}\cdot\tfrac{5}{12}\cdot 300 \approx 12$ units to $y'$, while a second difference $\nabla^2y'$ of about 18 units would contribute about 0.7 units to $y$, if we suppose that the neglected second differences relative to $x_1$ are of the same order of magnitude as those calculated here, we may consider these quantities as rough estimates of the truncation errors in $u_2$ and $y_2$, introduced in the step from $x_1$ to $x_2$. Further information with regard to the reliability of these estimates would be afforded, in succeeding steps, by a consideration of the extent to which the second differences remain constant. If such errors are not tolerable, and this method is to be used, it is then necessary to calculate one or more additional starting values of $y$ and $u$, and to retain at least one more difference. In this connection, it should be kept in mind that the errors introduced in each step are propagated into succeeding calculations, as was seen in the first-order case in §§6.7 and 6.8, in a manner which depends both upon the problem involved and the integration formula employed.

If, instead, the Adams formula of closed type is used with first differences, with the open-type formula as a predictor, the value 0.2028 is obtained, as in the preceding method, as the zeroth approximation $u^{(0)}_2 = y'^{(0)}_2$. The corresponding difference $\nabla y'_2$ is then entered, $y^{(0)}_2$ is determined by the formula

$$y^{(0)}_2 = 1.0050 + 0.1\left[0.2028 - \tfrac12(0.1023)\right] = 1.0202,$$

and $u'^{(0)}_2$ is determined as $y^{(0)}_2 + x_2u^{(0)}_2 = 1.0608$, so that the third line takes the form

0.2 | 1.0202 | 0.2028 | 1023 | — | 1.0608 | 458 | —    (6.10.18)

Next the cycle is closed by calculating

$$u^{(1)}_2 = 0.1005 + 0.1\left[1.0608 - \tfrac12(0.0458)\right] = 0.2043.$$

Since this result differs from the initial prediction, the entry 0.2028 in the third line is altered to 0.2043 and the cycle is repeated, at the end of which the third line has been changed to

0.2 | 1.0202 | 0.2043 | 1038 | 33 | 1.0611 | 461 | 311    (6.10.19)

Finally, the value $u^{(2)}_2$ is calculated and is found to agree with $u^{(1)}_2$, to four places, so that the iteration is completed.
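The step from $x_1$ to $x_2$ can be retraced in code. The following is a minimal sketch of the open-type prediction and the iterated closed-type correction with first differences for (6.10.15), assuming the four-place starting values quoted above; no intermediate rounding is applied, so the results agree with the tabulated entries only to about half a unit in the fourth place. The function name `adams_first_difference_step` is hypothetical.

```python
def adams_first_difference_step(x1, h, y0, u0, y1, u1, tol=1e-12, max_iter=50):
    """One step of the first-difference Adams formulas for y'' = y + x*y'.
    Returns ((y_open, u_open), (y_closed, u_closed))."""
    G = lambda x, y, u: y + x * u               # right-hand member of (6.10.15)
    w0, w1 = G(x1 - h, y0, u0), G(x1, y1, u1)   # u' = y'' at x_0 and x_1
    # Open type: y_{n+1} = y_n + h (y'_n + (1/2) del y'_n), likewise for u.
    y_open = y1 + h * (u1 + 0.5 * (u1 - u0))
    u_open = u1 + h * (w1 + 0.5 * (w1 - w0))
    # Closed type: y_{n+1} = y_n + h (y'_{n+1} - (1/2) del y'_{n+1}), iterated
    # in the cycle described by (6.10.7), starting from the open prediction.
    u = u_open
    for _ in range(max_iter):
        y = y1 + h * (u - 0.5 * (u - u1))
        w = G(x1 + h, y, u)
        u_new = u1 + h * (w - 0.5 * (w - w1))
        converged = abs(u_new - u) < tol
        u = u_new
        if converged:
            break
    y = y1 + h * (u - 0.5 * (u - u1))
    return (y_open, u_open), (y, u)


(y_open, u_open), (y_closed, u_closed) = adams_first_difference_step(
    0.1, 0.1, 1.0000, 0.0000, 1.0050, 0.1005)
print(y_open, u_open)      # compare with 1.0201 and 0.2028
print(y_closed, u_closed)  # compare with 1.0202 and 0.2043
```

The differences between the closed and open results, about 1 and 15 units in the fourth place, are the quantities used below to estimate the truncation error.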

Reference to (6.3.2) shows that incorporation of the second differences would contribute $-\tfrac{1}{10}\cdot\tfrac{1}{12}\cdot 311 \approx -3$ units to $y'_2$, and $-\tfrac{1}{10}\cdot\tfrac{1}{12}\cdot 33 \approx 0$ units to $y_2$. A somewhat more dependable estimate of the truncation error introduced in a single step can be obtained by calculating the initial approximation $y^{(0)}_{n+1}$ by use of the open formula, in place of the closed one, but using the closed formula in subsequent iterations. Since also $u^{(0)}_{n+1}$ is calculated by the open formula, it then follows that if we write

$$\gamma_{n+1} = y_{n+1} - y^{(0)}_{n+1}, \qquad \gamma'_{n+1} = u_{n+1} - u^{(0)}_{n+1},$$

where $y_{n+1}$ and $u_{n+1}$ are the values provided by the closed formula after appropriate iteration, then the desired truncation errors in $y_{n+1}$ and $u_{n+1}$ are approximated respectively by $-\gamma_{n+1}/C$ and $-\gamma'_{n+1}/C$, where $C$ is the numerical factor considered in §6.6, here equal to 6 (see also Prob. 28). It is convenient to tabulate $\gamma_n$ and $\gamma'_n$ in place of the two first neglected differences, so that the line (6.10.19) then is replaced by

$x$ | $y$ | $y' = u$ | $\nabla y'$ | $\gamma$ | $y''$ | $\nabla y''$ | $\gamma'$
0.2 | 1.0202 | 0.2043 | 1038 | 1 | 1.0611 | 461 | 15

The estimated truncation errors introduced into the calculated values of $y'_2$ and $y_2$ are then $-\tfrac{15}{6} \approx -3$ units and $-\tfrac16 \approx 0$ units, as before.

It may be noticed that the actual errors, obtainable by reference to the 
rounded true values given in (6.10.16), are indeed correctly predicted, in 
this case, by the estimates afforded by both procedures. 

In a fairly lengthy sequence of steps, however, the propagation of errors 
becomes particularly important. This rather unpleasant problem is con- 
sidered, for the second-order equation, in the following section. 

In this connection, it may be remarked that an elementary analysis quite similar to that of §6.7 permits a study of the situation corresponding to the use of (6.10.3) and (6.10.4) in the numerical solution of an equation of the special form

$$y'' = Ly' + Ky, \tag{6.10.20}$$

where $L$ and $K$ are constants. Here the exact solution is of the form

$$y(x) = c_1e^{A_1x} + c_2e^{A_2x}, \tag{6.10.21}$$

where $A_1$ and $A_2$ are the roots of $A^2 - LA - K = 0$ and $c_1$ and $c_2$ are determined by the initial conditions, and it is again found that the use of formulas (6.10.3) and (6.10.4) with $p > 0$ introduces "parasitic solutions" which may dominate the part of the solution which simulates the exact solution when $A_1$ and $A_2$ are negative or have negative real parts. When $p = 0$, this situation can exist only when excessively large spacings are employed.

In particular, if use is made of Milne's method, based on (6.6.8) and corresponding to $p = 1$, the generated numerical solution is found to be approximated by

$$C_1e^{A_1x} + C_2e^{A_2x} + (-1)^k\left[C_3e^{-A_1x/3} + C_4e^{-A_2x/3}\right] \tag{6.10.22}$$

when $x = x_k$, if the spacing $h$ is small and if round-off errors are neglected, where the $C$'s are determined by the starting values. Thus, for example, if the true solution is of the form

$$y(x_k) = ce^{-ax_k}\cos(bx_k + \omega),$$

where $a > 0$, then the parasitic part of the numerical solution will be approximated by

$$(-1)^kc'e^{ax_k/3}\cos\left(\frac{bx_k}{3} + \omega'\right)$$

and will tend to dominate the desirable part of the numerical solution when $x$ is large.
It is also important to notice that the propagated error generally will possess components simulating both the terms $e^{A_1x}$ and $e^{A_2x}$, even when the parasitic solutions are not troublesome. This situation is of particular disadvantage when the initial conditions require that the exact solution involve only the term which grows least rapidly (or decays most rapidly).

When the governing differential equation $y'' = G(x,y,y')$ is less simple than (6.10.20), a qualitative analysis of short-range error propagation near a point $(x_k,y_k)$ generally can be obtained by identifying $L$ and $K$ in (6.10.20) with the values of $\partial G/\partial y'$ and $\partial G/\partial y$, respectively, at that point, if those partial derivatives do not vary excessively near that point.

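6.11. Propagated-error Bounds. If the true values of the solution and its derivative at $x_n$ are denoted by $Y_n$ and $U_n$, respectively, and if the approximate values are calculated from (6.10.3) and (6.10.4), then we have the relations

$$\begin{aligned} y_{n+1} &= y_{n-p} + h\sum_{k=-1}^{r}\alpha_ku_{n-k} - R_n, \\ Y_{n+1} &= Y_{n-p} + h\sum_{k=-1}^{r}\alpha_kU_{n-k} + T_n, \end{aligned} \tag{6.11.1}$$

and

$$\begin{aligned} u_{n+1} &= u_{n-p} + h\sum_{k=-1}^{r}\alpha_kG(x_{n-k},y_{n-k},u_{n-k}) - R'_n, \\ U_{n+1} &= U_{n-p} + h\sum_{k=-1}^{r}\alpha_kG(x_{n-k},Y_{n-k},U_{n-k}) + T'_n, \end{aligned} \tag{6.11.2}$$

where $R_n$, $R'_n$, $T_n$, and $T'_n$ are round-off and truncation errors introduced in the $n$th step itself. If we write

$$\epsilon_n = Y_n - y_n, \qquad \epsilon'_n = U_n - u_n, \tag{6.11.3}$$

and

$$E_n = T_n + R_n, \qquad E'_n = T'_n + R'_n, \tag{6.11.4}$$

we obtain, from (6.11.1) and (6.11.2), the relations

$$\epsilon_{n+1} = \epsilon_{n-p} + h\sum_{k=-1}^{r}\alpha_k\epsilon'_{n-k} + E_n \tag{6.11.5}$$

and

$$\epsilon'_{n+1} = \epsilon'_{n-p} + h\sum_{k=-1}^{r}\alpha_k\left(G_{y_{n-k}}\epsilon_{n-k} + G_{y'_{n-k}}\epsilon'_{n-k}\right) + E'_n, \tag{6.11.6}$$

where $G_{y_{n-k}}$ and $G_{y'_{n-k}}$ are values of $G_y$ and $G_{y'}$, respectively, for $x = x_{n-k}$, some $y$ between $y_{n-k}$ and $Y_{n-k}$, and some $y'$ between $u_{n-k}$ and $U_{n-k}$.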


We now suppose that, for all values of $x$, $y$, and $u$ involved in the calculation, we have

$$|G_y(x,y,u)| \leq K, \qquad |G_{y'}(x,y,u)| \leq L, \tag{6.11.7}$$

where $K$ and $L$ are fixed constants, and consider the associated relations

$$e_{n+1} = e_{n-p} + h\sum_{k=-1}^{r}|\alpha_k|e'_{n-k} + E \tag{6.11.8}$$

and

$$e'_{n+1} = e'_{n-p} + h\sum_{k=-1}^{r}|\alpha_k|\left(Ke_{n-k} + Le'_{n-k}\right) + E', \tag{6.11.9}$$

where $E$ and $E'$ are fixed constants such that

$$|E_n| \leq E, \qquad |E'_n| \leq E'. \tag{6.11.10}$$

Then, by an argument similar to that of §6.8, it follows that if $|\epsilon_n| \leq e_n$ and $|\epsilon'_n| \leq e'_n$ for $n = 0, 1, \ldots, r$ and if

$$h|\alpha_{-1}|\left(h|\alpha_{-1}|K + L\right) < 1, \tag{6.11.11}$$

then $|\epsilon_n| \leq e_n$ and $|\epsilon'_n| \leq e'_n$ for all positive integral values of $n$.

In order to obtain a solution of (6.11.8) and (6.11.9) with this property, we first seek a particular solution in the form $e_n = -\lambda$, $e'_n = -\lambda'$, where $\lambda$ and $\lambda'$ are independent of $n$, and so obtain the determination

$$\lambda = \frac{E' - LE}{Kh\sigma}, \qquad \lambda' = \frac{E}{h\sigma}, \tag{6.11.12}$$

where again

$$\sigma = \sum_{k=-1}^{r}|\alpha_k|. \tag{6.11.13}$$


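Upon this particular solution we may superimpose

$$e_n = A\beta^n, \qquad e'_n = A'\beta^n, \tag{6.11.14}$$

if the constants $A$, $A'$, and $\beta$ are determined so that these expressions satisfy (6.11.8) and (6.11.9) when $E$ and $E'$ are replaced by zeros and hence so that

$$A\left(\beta^{r+1} - \beta^{r-p}\right) - hA'\sum_{k=-1}^{r}|\alpha_k|\beta^{r-k} = 0 \tag{6.11.15}$$

and

$$-hKA\sum_{k=-1}^{r}|\alpha_k|\beta^{r-k} + A'\left(\beta^{r+1} - \beta^{r-p} - hL\sum_{k=-1}^{r}|\alpha_k|\beta^{r-k}\right) = 0. \tag{6.11.16}$$

These equations are compatible, with not both $A$ and $A'$ zero, if and only if the determinant of the coefficients of $A$ and $A'$ vanishes, that is, if $\beta$ satisfies the characteristic equation

$$\begin{vmatrix} \beta^{r+1} - \beta^{r-p} & -h\sum|\alpha_k|\beta^{r-k} \\ -hK\sum|\alpha_k|\beta^{r-k} & \beta^{r+1} - \beta^{r-p} - hL\sum|\alpha_k|\beta^{r-k} \end{vmatrix} = 0. \tag{6.11.17}$$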
Now, when $\beta = 1$, the left-hand member reduces to

$$\begin{vmatrix} 0 & -h\sigma \\ -hK\sigma & -hL\sigma \end{vmatrix} = -h^2K\sigma^2$$

and is therefore negative. For large positive real values of $\beta$, the highest power of $\beta$ will dominate the expansion, and its coefficient is seen to be

$$\begin{vmatrix} 1 & -h|\alpha_{-1}| \\ -hK|\alpha_{-1}| & 1 - hL|\alpha_{-1}| \end{vmatrix} = 1 - h|\alpha_{-1}|\left(h|\alpha_{-1}|K + L\right),$$

which is positive when $h$ is sufficiently small to ensure that

$$h|\alpha_{-1}|\left(h|\alpha_{-1}|K + L\right) < 1, \tag{6.11.18}$$

in accordance with (6.11.11). It is interesting to notice that this is also the requirement that (6.10.12) hold for all $n$, so that the iteration relevant to a closed-type formula is convergent.

Thus, if (6.11.18) is satisfied, there exists (at least) one positive real root of (6.11.17), say $\beta_0$, which exceeds unity.


If we write

$$\beta_0 = 1 + m_1h + m_2h^2 + \cdots$$

and expand the elements of (6.11.17) in powers of $h$, there follows

$$\begin{vmatrix} (p + 1)m_1h + \cdots & -h\sigma + \cdots \\ -hK\sigma + \cdots & \left[(p + 1)m_1 - L\sigma\right]h + \cdots \end{vmatrix} = 0,$$

and hence, in particular, the requirement that the leading coefficient (of $h^2$) in the expansion vanish leads to the equation

$$(p + 1)^2m_1^2 - (p + 1)L\sigma m_1 - K\sigma^2 = 0,$$

the positive root of which is

$$m_1 = \frac{\sigma}{p + 1}\,\frac{L + \sqrt{L^2 + 4K}}{2}. \tag{6.11.19}$$

Thus we have the result

$$\beta_0 = 1 + m_1h + O(h^2), \tag{6.11.20}$$

for small $h$.
When $\beta = \beta_0$, the two equations (6.11.15) and (6.11.16) become equivalent, and either can be used to express $A'$ in terms of $A$. If we use the former, we find that

$$A' = \frac{A}{\mu}, \tag{6.11.21}$$

where

$$\mu = \frac{h\sum_{k=-1}^{r}|\alpha_k|\beta_0^{r-k}}{\beta_0^{r+1} - \beta_0^{r-p}} = \frac{\sigma}{(p + 1)m_1} + O(h). \tag{6.11.22}$$

Thus with $\beta_0$ and $\mu$ so defined, Eqs. (6.11.8) and (6.11.9) are satisfied by

$$e_n = A\beta_0^n - \lambda, \qquad e'_n = \frac{A}{\mu}\beta_0^n - \lambda', \tag{6.11.23}$$

where $\lambda$ and $\lambda'$ are constants defined in (6.11.12), and where $A$ is an arbitrary constant. These expressions will dominate the solutions $\epsilon_n$ and $\epsilon'_n$ of (6.11.5) and (6.11.6) for all $n$ if they do so for $n = 0, 1, \ldots, r$. If we denote the maximum values of $|\epsilon_n|$ and $|\epsilon'_n|$ for $n \leq r$ by $\bar e$ and $\bar e'$, respectively, it remains only to choose $A$ such that

$$A - \lambda \geq \bar e, \qquad \frac{A}{\mu} - \lambda' \geq \bar e'.$$

Thus, if we take

$$A = \max\left[\bar e + \lambda,\ \mu\left(\bar e' + \lambda'\right)\right], \tag{6.11.24}$$

the expressions defined in (6.11.23) will be at least as large as $\bar e$ and $\bar e'$, respectively, for $n = 0$ and, since $A$ and $\mu$ are positive and $\beta_0 > 1$, will continue to be so for $n > 0$.

It follows finally that, with $A$ defined by (6.11.24), the errors $\epsilon_n$ and $\epsilon'_n$ associated with the use of the formulas (6.10.3) and (6.10.4) are limited by the inequalities

$$|\epsilon_n| \leq A\beta_0^n - \frac{E' - LE}{Kh\sigma}, \qquad |\epsilon'_n| \leq \frac{A}{\mu}\beta_0^n - \frac{E}{h\sigma}. \tag{6.11.25}$$

The relative simplicity of these forms is attained at the expense of conservatism of the bounds. Less-conservative bounds would be obtained in a much more complicated form if the general solution of (6.11.8) and (6.11.9) were obtained, with $2r + 2$ independent arbitrary constants, and if those constants were determined such that $e_n = |\epsilon_n|$ and $e'_n = |\epsilon'_n|$ for $n = 0, 1, \ldots, r$, or if (6.11.8) and (6.11.9) were replaced by dominating equations of less simple form.

In those cases when the coefficients $\alpha_i$ are nonnegative, so that
$$\sigma = p + 1,$$
in accordance with (6.7.10), Eqs. (6.11.19) and (6.11.20) give


$$\beta_0 = 1 + \frac{h}{2}\left(L + \sqrt{L^2 + 4K}\right) + O(h^2).$$ (6.11.26)


Hence there follows 


$$\beta_0^n = \left[1 + \frac{h}{2}\left(L + \sqrt{L^2 + 4K}\right) + \cdots\right]^{(x_n - x_0)/h} \approx e^{\left(L + \sqrt{L^2 + 4K}\right)(x_n - x_0)/2},$$ (6.11.27)




when $h$ is small, so that the dominating functions in (6.11.25) grow in the same way as $\exp[\tfrac12(L + \sqrt{L^2 + 4K})\,x_n]$ as $n$ increases.
6.12. Special Second-order Equations. Second-order equations of the special form
$$y'' = G(x, y),$$ (6.12.1)
in which $y'$ is not explicitly involved, arise very frequently in practice. If values of $y'$ are not required, it is desirable to have available methods which do not entail their calculation.

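Two formulas having this property were derived, as formulas (5.5.11) and (5.5.12) for repeated integration, in §5.5, and may be written with the present notation in the forms
$$y_{n+1} = 2y_n - y_{n-1} + h^2\left(1 + 0\nabla + \tfrac{1}{12}\nabla^2 + \tfrac{1}{12}\nabla^3 + \tfrac{19}{240}\nabla^4 + \tfrac{3}{40}\nabla^5 + \cdots\right)y_n''$$ (6.12.2)
and
$$y_{n+1} = 2y_n - y_{n-1} + h^2\left(1 - \nabla + \tfrac{1}{12}\nabla^2 + 0\nabla^3 - \tfrac{1}{240}\nabla^4 - \tfrac{1}{240}\nabla^5 - \cdots\right)y_{n+1}''.$$ (6.12.3)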


The former is of open type, the latter of closed type. In order to use either, a suitable number of preliminary ordinates must be calculated by another method, which takes into account the fact that $y$ and $y'$ are prescribed at $x = x_0$, after which the technique of the ensuing calculation, often known as Störmer's method, is evident.

Formulas (6.12.2) and (6.12.3) are each representative of a whole class of similar formulas, one class of open type and the other of closed type, which are analogous to the formulas given in §§6.2 and 6.3. In particular, an additional formula of open type,
$$y_{n+1} = y_n + y_{n-2} - y_{n-3} + 3h^2\left(1 - \nabla + \tfrac{5}{12}\nabla^2 + 0\nabla^3 + \tfrac{17}{720}\nabla^4 + \tfrac{17}{720}\nabla^5 + \cdots\right)y_n'',$$ (6.12.4)
may be listed (see Prob. 11 of Chap. 5).

Formulas (6.12.3) and (6.12.4) comprise a pair of formulas for both of which the coefficient of the third difference vanishes. If only second differences are retained in these formulas, they become


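$$y_{n+1} = y_n + y_{n-2} - y_{n-3} + 3h^2\left(y_n'' - \nabla y_n'' + \tfrac{5}{12}\nabla^2 y_n''\right) + \frac{17h^6}{240}y^{\mathrm{vi}} = y_n + y_{n-2} - y_{n-3} + \frac{h^2}{4}\left(5y_n'' + 2y_{n-1}'' + 5y_{n-2}''\right) + \frac{17h^6}{240}y^{\mathrm{vi}}$$ (6.12.5)
and
$$y_{n+1} = 2y_n - y_{n-1} + h^2\left(y_{n+1}'' - \nabla y_{n+1}'' + \tfrac{1}{12}\nabla^2 y_{n+1}''\right) - \frac{h^6}{240}y^{\mathrm{vi}} = 2y_n - y_{n-1} + \frac{h^2}{12}\left(y_{n+1}'' + 10y_n'' + y_{n-1}''\right) - \frac{h^6}{240}y^{\mathrm{vi}},$$ (6.12.6)
where the coefficients of the remainder terms are the same as those of the omitted fourth differences in the formulas (6.12.3) and (6.12.4). The error term when one of the formulas is truncated with a difference not preceding one with a zero coefficient is of more complicated form.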

Milne's method employs (6.12.5) to afford an initial prediction $y_{n+1}^{(0)}$, and (6.12.6) as the basic formula, to be solved for $y_{n+1}$ by iteration. If the factor $\gamma_{n+1} = y_{n+1} - y_{n+1}^{(0)}$ is calculated (as in §6.6), the estimated truncation error in the $n$th step is seen to be $T_n \approx -\gamma_{n+1}/18$. Also, the convergence factor in the iteration at the $n$th step (see §6.9) is easily found to be approximately $\rho_n \approx \tfrac{1}{12}h^2 G_y(x_n, y_n)$, so that $h$ should be sufficiently small to ensure that $|\rho_n| < 1$.

In order to illustrate the relevant analysis of error propagation, we consider the special case in which (6.12.6) is the basis of the method.† With the same notation as was used in earlier developments, it is easily seen that the error $\varepsilon_n$ associated with the calculated value $y_n$ satisfies a relation of the form
$$\varepsilon_{n+1} = 2\varepsilon_n - \varepsilon_{n-1} + \frac{h^2}{12}\left(G_{y_{n+1}}\varepsilon_{n+1} + 10G_{y_n}\varepsilon_n + G_{y_{n-1}}\varepsilon_{n-1}\right) + E_n,$$ (6.12.7)
for $n \ge 1$, where $G_{y_n}$ is an appropriate value of $G_y$. This relation can also be written in the form
$$\left(1 - \frac{h^2}{12}G_{y_{n+1}}\right)(\varepsilon_{n+1} - \varepsilon_n) = \varepsilon_n - \varepsilon_{n-1} + \frac{h^2}{12}\left[(G_{y_{n+1}} + 10G_{y_n})\varepsilon_n + G_{y_{n-1}}\varepsilon_{n-1}\right] + E_n,$$ (6.12.8)
so that, if we have
$$|G_y(x,y)| \le K,$$ (6.12.9)
for all relevant values of $x$ and $y$, and if $h$ is sufficiently small to ensure that
$$\frac{Kh^2}{12} < 1,$$ (6.12.10)


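there follows
$$\left(1 - \frac{Kh^2}{12}\right)|\varepsilon_{n+1} - \varepsilon_n| \le |\varepsilon_n - \varepsilon_{n-1}| + \frac{Kh^2}{12}\left(11|\varepsilon_n| + |\varepsilon_{n-1}|\right) + |E_n|.$$ (6.12.11)
If $e_n$ satisfies the relation
$$\left(1 - \frac{Kh^2}{12}\right)(e_{n+1} - e_n) = e_n - e_{n-1} + \frac{Kh^2}{12}(11e_n + e_{n-1}) + E,$$ (6.12.12)
where
$$E = |E_n|_{\max},$$ (6.12.13)
and if
$$e_0 \ge |\varepsilon_0|, \qquad e_1 - e_0 \ge |\varepsilon_1 - \varepsilon_0|,$$ (6.12.14)
there follows also $e_1 - e_0 \ge |\varepsilon_1| - |\varepsilon_0|$ and hence $e_1 - |\varepsilon_1| \ge e_0 - |\varepsilon_0|$ or $e_1 \ge |\varepsilon_1|$. Then, by comparing (6.12.11) and (6.12.12), we find that $e_2 - e_1 \ge |\varepsilon_2 - \varepsilon_1|$ and hence $e_2 - e_1 \ge |\varepsilon_2| - |\varepsilon_1|$ or $e_2 \ge |\varepsilon_2|$. By induction, there follows $e_n - e_{n-1} \ge |\varepsilon_n - \varepsilon_{n-1}|$ and $e_n \ge |\varepsilon_n|$ for $n = 1, 2, \ldots$, so that the error $\varepsilon_n$ is then dominated by $e_n$.

† The method used for the initial prediction at each step is irrelevant to this analysis.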
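The general solution of (6.12.12) is readily found to be of the form
$$e_n = A_0\beta_0^n + A_1\beta_1^n - \frac{E}{Kh^2},$$ (6.12.15)
where $\beta_0$ and $\beta_1$ are the roots of the equation
$$\left(1 - \frac{Kh^2}{12}\right)\beta^2 - 2\left(1 + \frac{5Kh^2}{12}\right)\beta + \left(1 - \frac{Kh^2}{12}\right) = 0,$$ (6.12.16)
and hence
$$\beta_0 = \frac{1}{\beta_1} = \frac{1 + \frac{5}{12}Kh^2 + \sqrt{Kh^2 + \frac{1}{6}K^2h^4}}{1 - \frac{1}{12}Kh^2} = 1 + \sqrt{K}\,h + O(h^2).$$ (6.12.17)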
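When $A_0$ and $A_1$ are determined by the conditions $e_0 = 0$ and $e_1 = |\varepsilon_1|$, under the assumption that $\varepsilon_0 = 0$, there follows finally
$$|\varepsilon_n| \le e_n,$$
where
$$e_n = \frac{E}{Kh^2}\left(\frac{\beta_0^n + \beta_0^{1-n}}{\beta_0 + 1} - 1\right) + \frac{\beta_0|\varepsilon_1|}{\beta_0^2 - 1}\left(\beta_0^n - \beta_0^{-n}\right).$$ (6.12.18)
If round-off errors are ignored, we have $E \le \tfrac{1}{240}h^6|y^{\mathrm{vi}}|_{\max}$. Also, since $\beta_0 = 1 + \sqrt{K}\,h + O(h^2)$, and $n = (x_n - x_0)/h$, there follows
$$|\varepsilon_n| \lesssim \frac{h^4}{240K}|y^{\mathrm{vi}}|_{\max}\left\{\cosh[\sqrt{K}(x_n - x_0)] - 1\right\} + \frac{|\varepsilon_1|}{\sqrt{K}\,h}\sinh[\sqrt{K}(x_n - x_0)],$$ (6.12.19)
when $h$ is small. Whereas $|y^{\mathrm{vi}}|_{\max}$ is generally not easy to estimate directly, the factor $[h^4|y^{\mathrm{vi}}|_{\max}]/240$ can be estimated as $h^{-2}|T_n|_{\max} \approx |\gamma_n|_{\max}/18h^2$ or as $|\nabla^4 y''|_{\max}/240$.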


Similar but more involved error bounds can be derived in the more general case.


In order to illustrate the calculation and to provide a basis for the considerations of the following section, we apply Milne's method, based on (6.12.5) and (6.12.6), to the problem
$$y'' = xy, \qquad y(0) = 0, \qquad y'(0) = 1,$$ (6.12.20)
for which the exact solution is expressible in the form
$$y = 3^{-2/3}\,\Gamma(\tfrac13)\,x^{1/2}\,I_{1/3}(\tfrac23 x^{3/2}),$$ (6.12.21)
where $I_{1/3}$ is the modified Bessel function of the first kind, of order $\tfrac13$. With $h = 0.1$, the calculation can be arranged as in Table 6.3, if differences are used and if five places are retained.

TABLE 6.3
  x      y        y''      ∇y''    ∇²y''    γ
 0.0  0.00000  0.00000
 0.1  0.10001  0.01000    1000
 0.2  0.20013  0.04003    3003    2003
 0.3  0.30068  0.09020    5017    2014
 0.4  0.40214  0.16086    7066    2049
 0.5  0.50523  0.25262    9176    2110     1
 0.6  0.61086  0.36652   11390    2214    -1
 0.7  0.72017  0.50412   13760    2370    -1
 0.8  0.83454  0.66763   16351    2591    -1


The first five lines are easily calculated in advance (only three lines are needed), the ordinates being determined by use of a single Taylor series, and the values of $y''$, determined from the equation $y'' = xy$, and of the differences are entered as shown. If (6.12.5) is used to predict $y_5$, the prediction is found to be 0.50522; the remainder of the sixth line is then filled in, after which (6.12.6) gives the revised value 0.50523, and the resultant slight modification in the remainder of the line does not call for additional iteration. The value $\gamma_5 = 0.50523 - 0.50522$ is then listed as +1 unit in the fifth place, and the calculation proceeds in the same way in succeeding steps.
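The predictor-corrector cycle just described can be sketched in Python. The following is only an illustrative sketch and not part of the original calculation: the names `taylor_start` and `milne_second_order` are hypothetical, the starting ordinates are taken from the Taylor series in full double precision rather than rounded to five places, and a fixed number of corrector iterations is assumed.

```python
def taylor_start(x):
    """Leading terms of the Taylor series of the solution of y'' = x*y, y(0) = 0, y'(0) = 1."""
    return x + x**4 / 12 + x**7 / 504 + x**10 / 45360 + x**13 / 7076160


def milne_second_order(G, xs, ys, h, steps, iterations=2):
    """Advance y'' = G(x, y) using (6.12.5) as predictor and (6.12.6) as corrector.

    xs, ys: at least four equally spaced starting abscissas and ordinates.
    Returns the extended lists and gamma = corrected - predicted for each step.
    """
    xs, ys = list(xs), list(ys)
    ypp = [G(x, y) for x, y in zip(xs, ys)]  # stored values of y''
    gammas = []
    for _ in range(steps):
        x_new = xs[-1] + h
        # Predictor (6.12.5) in ordinate form.
        predicted = (ys[-1] + ys[-3] - ys[-4]
                     + h * h / 4 * (5 * ypp[-1] + 2 * ypp[-2] + 5 * ypp[-3]))
        # Corrector (6.12.6) in ordinate form, applied iteratively.
        y_new = predicted
        for _ in range(iterations):
            y_new = (2 * ys[-1] - ys[-2]
                     + h * h / 12 * (G(x_new, y_new) + 10 * ypp[-1] + ypp[-2]))
        xs.append(x_new)
        ys.append(y_new)
        ypp.append(G(x_new, y_new))
        gammas.append(y_new - predicted)
    return xs, ys, gammas


h = 0.1
x_start = [i * h for i in range(5)]            # x = 0.0, 0.1, ..., 0.4
y_start = [taylor_start(x) for x in x_start]
xs, ys, gammas = milne_second_order(lambda x, y: x * y, x_start, y_start, h, steps=4)
for x, y in zip(xs, ys):
    print(f"{x:.1f}  {y:.5f}")
```

Because no five-place rounding is imposed here, the printed ordinates should agree with those of Table 6.3 only to within about a unit in the fifth place, and the computed values of gamma are far smaller than one such unit.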

Since the truncation error in each step may be estimated as $-\gamma/18$, we may be reasonably confident of the calculated values of $y$ to the places retained (except for the usual uncertainty of one unit in the last place, due to round-off). In fact, the small values of $\gamma$ may be expected to correspond to the effects of round-off.

Clearly, the alternative forms of (6.12.5) and (6.12.6) may be used instead, without the need for calculation of differences. This procedure possesses an additional advantage in that then only the entries $y_{n+1}$ and $y_{n+1}''$ are modified in successive steps of each iteration process. However, whereas these advantages are of particular significance when large-scale computing devices are used, so that simplicity in programming and minimization of storage requirements are of prime importance, they may compare unfavorably in other cases with the advantages which follow from the possibility of detecting gross errors by inspecting the regularity of the trend of the difference columns.

Further, in place of using (6.12.5) to obtain an initial prediction for $y_{n+1}$, it is possible to estimate the second difference $\nabla^2 y_{n+1}''$ and to then fill in the remainder of that line from right to left through $y_{n+1}''$, after which use may be made of (6.12.6) to initiate the prediction. Thus, in Table 6.3, a glance at the $\nabla^2 y''$ column would suggest the estimate $\nabla^2 y_8'' \approx 0.026$ after the calculation of $y_7$. However, this procedure would not supply data for the $\gamma$ column.


It may be noticed that any linear equation of the second order, of the form
$$Y'' + P(x)Y' + Q(x)Y = F(x),$$ (6.12.22)
can be reduced to the form (6.12.1) by the change of variables
$$Y(x) = e^{-\frac12\int P\,dx}\,y(x),$$ (6.12.23)
in accordance with which (6.12.22) takes the form
$$y'' + f(x)y = g(x),$$ (6.12.24)
where
$$f(x) = \tfrac14\{4Q(x) - 2P'(x) - [P(x)]^2\}, \qquad g(x) = e^{\frac12\int P\,dx}F(x).$$ (6.12.25)


6.13. Change of Interval. In many cases it is desirable to, say, double 
or halve the spacing at a certain stage of the advancing calculation. 
Doubling the spacing presents no difficulties, since it involves only the 
use of alternate values of previously calculated data, together with a 
direct calculation of modified differences relevant to the new spacing, if 
differences are used. 

Thus, in illustration, the smallness of the entries in the $\gamma$ column of Table 6.3 suggests that the same accuracy may be obtained with a doubled spacing $h' = 2h = 0.2$. In fact, reference to the error expression in (6.12.6) shows that the truncation error in each step can be estimated roughly by
$$T_n \approx -\frac{h^2}{240}\nabla^4 y'' \approx -0.00004\,\nabla^4 y'',$$
and it is found that, for the data of Table 6.3, $\nabla^4 y''$ varies from 0.00024 to 0.00065, so that the largest single truncation error in the range covered is probably less than about three units in the eighth decimal place. Doubling the spacing $h$ will multiply the truncation error by a factor of the order $2^6$ and hence may be expected to lead to a truncation error of less than about one unit in the sixth place in each step. The calculation following the work of Table 6.3, with doubled spacing, is given in Table 6.4.

TABLE 6.4
  x      y        y''      ∇y''    ∇²y''    γ
 0.0  0.00000  0.00000
 0.2  0.20013  0.04003    4003
 0.4  0.40214  0.16086   12083    8080
 0.6  0.61086  0.36652   20566    8483
 0.8  0.83454  0.66763   30111    9545
 1.0  1.08531  1.08531   41768   11657     1
 1.2  1.38000  1.65600   57069   15301     3
 1.4  1.74164  2.43830   78230   21161     7

After three lines of calculation, the $\gamma$ column serves a warning that the truncation error per step may have increased at that stage to about one-half unit in the last place retained. Thus (as might have been anticipated in advance from the increasing rate of growth of $\nabla^2 y''$) the advantages of the more rapid calculation were short-lived, and the doubling of the spacing was ill-advised in the present case. However, the results of Table 6.4 may serve to illustrate the somewhat more complicated transition to a halved spacing.†
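The rough truncation-error estimate used above in deciding whether to double the spacing can be checked numerically. The following Python sketch is an illustration only: the helper name `difference_table` is hypothetical, and the data are the $y''$ column of Table 6.3 expressed in units of the fifth decimal place.

```python
def difference_table(values):
    """Return the successive difference columns of a list of equally spaced values."""
    columns = [list(values)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([b - a for a, b in zip(prev, prev[1:])])
    return columns


# y'' column of Table 6.3, in units of the fifth decimal place.
ypp_units = [0, 1000, 4003, 9020, 16086, 25262, 36652, 50412, 66763]
h = 0.1

fourth = difference_table(ypp_units)[4]          # fourth differences of y''
worst = max(abs(d) for d in fourth) * 1e-5       # largest |nabla^4 y''|
T_h = h * h / 240 * worst                        # estimate of |T_n| for spacing h
T_2h = 2**6 * T_h                                # rough estimate after doubling h

print(fourth)
print(T_h, T_2h)
```

The fourth differences range from 24 to 65 units of the fifth place, so that the estimate for spacing $h$ is a few units in the eighth decimal place, and the factor $2^6$ raises it to the order of a unit or two in the sixth place.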

In the present analysis, knowledge of $y(1.3)$ would permit the determination of $y''(1.3)$ and, consequently, $\nabla y''(1.4)$ and $\nabla^2 y''(1.4)$, relative to the new spacing $h = 0.1$. Then an iteration, based on (6.12.6), could be initiated by estimating $\nabla^2 y''(1.5)$ and proceeding as was outlined in the preceding section. The value of $y(1.3)$ could be obtained by an interpolation involving certain of the available calculated ordinates. Clearly, care should be taken to obtain this ordinate to the same degree of accuracy as the other ordinates. The use of a difference formula for this purpose would entail the calculation of differences of the ordinates themselves but would be desirable, in order that the accuracy of the interpolation could be estimated.

Another procedure consists in using the formulas derived in §5.7 to transform the tabulated differences $\nabla y''(1.4)$ and $\nabla^2 y''(1.4)$ to corresponding differences relative to the halved spacing. The ordinate $y(1.3)$ can then be determined by rewriting (6.12.6) in the form
$$y_n \approx \tfrac12(y_{n+1} + y_{n-1}) - \frac{h^2}{2}\left(y_{n+1}'' - \nabla y_{n+1}'' + \tfrac{1}{12}\nabla^2 y_{n+1}''\right).$$ (6.13.1)
If the difference operators corresponding to the halved spacing are denoted by $\nabla'$ and $\nabla'^2$, Eq. (5.7.4) yields the formulas
$$\nabla' = \tfrac12\nabla + \tfrac18\nabla^2 + \tfrac{1}{16}\nabla^3 + \tfrac{5}{128}\nabla^4 + \tfrac{7}{256}\nabla^5 + \tfrac{21}{1024}\nabla^6 + \tfrac{33}{2048}\nabla^7 + \cdots,$$ (6.13.2)
$$\nabla'^2 = \tfrac14\nabla^2 + \tfrac18\nabla^3 + \tfrac{5}{64}\nabla^4 + \tfrac{7}{128}\nabla^5 + \tfrac{21}{512}\nabla^6 + \tfrac{33}{1024}\nabla^7 + \cdots,$$ (6.13.3)
when $\Delta$ is replaced by $-\nabla$ in (5.7.4), in accordance with the results of §5.7. The use of these formulas permits the calculation of the differences relative to $x = 1.4$ in the third line of Table 6.5, after which $\nabla y''$ and $y''$ are obtained in line two and $y''$ in line one.

† An obvious alternative consists in merely retaining additional differences in the relevant integration formulas (6.12.3) and (6.12.4). In the present case, however, it is assumed that retention of the advantages of the special formulas (6.12.5) and (6.12.6) is considered to be desirable.





TABLE 6.5
  x      y        y''      ∇y''    ∇²y''    γ
 1.2  1.38000  1.65600
 1.3  1.55071  2.01592   35992
 1.4  1.74164  2.43830   42238    6246


A useful check on the accuracy of the modified differences is then afforded 
by a comparison of the value of y’’(1.2) so obtained with that previously 
obtained in the direct calculation of Table 6.4. 
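The transformation of differences to the halved spacing can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the helper names are hypothetical, the input is the $y''$ column of Table 6.4 in units of the fifth decimal place, the series (6.13.2) is truncated after the seventh difference (the highest available from eight tabulated values), and use is made of the observation that the series (6.13.3) is, term by term, twice (6.13.2) less $\nabla$.

```python
from fractions import Fraction


def backward_differences(values):
    """Return [v_n, nabla v_n, nabla^2 v_n, ...] at the last tabulated point."""
    result, row = [values[-1]], list(values)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        result.append(row[-1])
    return result


def halved_differences(values):
    """First and second backward differences for spacing h/2, from (6.13.2) and (6.13.3)."""
    nabla = backward_differences(values)
    coeff = Fraction(1, 2)                 # coefficient of nabla in (6.13.2)
    first = Fraction(0)
    for k in range(1, len(nabla)):
        first += coeff * nabla[k]
        coeff = coeff * (2 * k - 1) / (2 * k + 2)   # next coefficient: 1/8, 1/16, 5/128, ...
    second = 2 * first - nabla[1]          # (6.13.3) equals 2*(6.13.2) - nabla, term by term
    return first, second


# y'' column of Table 6.4 (spacing 0.2), in units of the fifth decimal place.
ypp_units = [0, 4003, 16086, 36652, 66763, 108531, 165600, 243830]
first, second = halved_differences(ypp_units)
print(float(first), float(second))
```

The two results should agree with the entries 42238 and 6246 in the third line of Table 6.5 to within about one unit, the residual difference being attributable to rounding of the tabulated entries.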

The ordinate $y(1.3)$ is next calculated from (6.13.1). Then if, say, $\nabla^2 y''(1.5)$ is estimated approximately as being equal to $\nabla^2 y''(1.4)$, the line


1.5 | 1.95701 | 2.92314 | 48484 6246 | (6.13.4) 


is obtained (from right to left), the first approximation to y(1.5) being 
obtained by use of (6.12.6). When y’’(1.5) and its differences are recal- 
culated (from left to right), the next approximation to y(1.5) is obtained 
from (6.12.6) as 1.95702, and the final form of this line of the calculation 
reads as follows: 


1.5 | 1.95702 | 2.93553 | 49723 7487 | (6.13.5) 


Sufficient data are now available for the use of (6.12.5) as a predictor in the next step, if this is desired, after which entries in the $\gamma$ column are again calculable.

Appropriate modifications of this procedure are easily devised in other cases.

6.14. Use of Higher Derivatives. It is possible to derive a variety of formulas, for the numerical integration of differential equations, which involve values of certain higher derivatives of the unknown function.

In particular, the Euler-Maclaurin sum formula (5.9.1) can be expressed in the form
$$y_{n+1} - y_{n-p} = h\left(\tfrac12 y_{n+1}' + y_n' + \cdots + y_{n-p+1}' + \tfrac12 y_{n-p}'\right) - \frac{h^2}{12}\left(y_{n+1}'' - y_{n-p}''\right) + \frac{h^4}{720}\left(y_{n+1}^{\mathrm{iv}} - y_{n-p}^{\mathrm{iv}}\right) - \frac{h^6}{30240}\left(y_{n+1}^{\mathrm{vi}} - y_{n-p}^{\mathrm{vi}}\right) + \cdots,$$ (6.14.1)
where the error committed by truncation with the term of order $h^{2k}$ is $(p + 1)h$ times the term of order $h^{2k+2}$ with the contents of the relevant parentheses replaced by $y^{(2k+3)}(\xi)$, where $x_{n-p} < \xi < x_{n+1}$. Thus, for example, with $p = 0$ we have the formula
$$y_{n+1} = y_n + \frac{h}{2}\left(y_{n+1}' + y_n'\right) - \frac{h^2}{12}\left(y_{n+1}'' - y_n''\right) + \frac{h^5}{720}y^{\mathrm{v}}(\xi),$$ (6.14.2)
of closed type, which may be used with any convenient predictor formula (preferably also with an error of order $h^5$) as in the methods discussed previously.

Formula (6.14.2) can be obtained also as a special case of the so-called Hermite interpolation formula, to be discussed in §8.2, and can also be derived by a method of undetermined coefficients, in which we write
$$y_{n+1} = y_n + h\left(\alpha_0 y_{n+1}' + \alpha_1 y_n'\right) + h^2\left(\beta_0 y_{n+1}'' + \beta_1 y_n''\right) + E,$$
so that $E = 0$ if $y(x)$ is a constant, and determine $\alpha_0$, $\alpha_1$, $\beta_0$, and $\beta_1$ in such a way that $E = 0$ also when $y(x) = x$, $x^2$, $x^3$, and $x^4$, and hence for any polynomial of degree four or less. For this purpose, it is convenient and nonrestrictive to take $h = 1$ and $x_n = 0$, so that the relevant equations become
$$\alpha_0 + \alpha_1 = 1, \qquad 2\alpha_0 + 2(\beta_0 + \beta_1) = 1, \qquad 3\alpha_0 + 6\beta_0 = 1, \qquad 4\alpha_0 + 12\beta_0 = 1,$$
and yield $\alpha_0 = \alpha_1 = \tfrac12$ and $\beta_0 = -\beta_1 = -\tfrac{1}{12}$, in accordance with (6.14.2). The error term can then be determined by the methods of §§5.11 and 5.12, if the formula is first rewritten in the equivalent form
$$\int_0^1 f'(s)\,ds = \tfrac12[f'(0) + f'(1)] + \tfrac{1}{12}[f''(0) - f''(1)] + E,$$
where $f(s) = y(x_n + sh)$. Reference to §5.12 then gives
$$E = \int_0^1 \pi(s)\,f'[0,0,1,1,s]\,ds, \qquad \pi(s) = s^2(s - 1)^2,$$
and, since $\pi(s)$ does not change sign, there follows
$$E = \frac{f^{\mathrm{v}}(\eta)}{4!}\int_0^1 \pi(s)\,ds = \tfrac{1}{720}f^{\mathrm{v}}(\eta) = \tfrac{1}{720}h^5 y^{\mathrm{v}}(x_n + \eta h) = \tfrac{1}{720}h^5 y^{\mathrm{v}}(\xi),$$
where $0 < \eta < 1$, and hence $x_n < \xi < x_{n+1}$.

In the same way, a formula of open type, involving only $y_{n+1}$, $y_{n-1}$, $y_n'$, $y_{n-1}'$, $y_n''$, and $y_{n-1}''$, can be obtained in the form
$$y_{n+1} = y_{n-1} + 2hy_{n-1}' + \frac{2h^2}{3}\left(2y_n'' + y_{n-1}''\right) + \frac{2h^5}{45}y^{\mathrm{v}}(\xi),$$ (6.14.3)
and can be used as a predictor in connection with (6.14.2).
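The method of undetermined coefficients described above lends itself to a short exact computation. The following Python sketch is offered only as an illustration: the helper names `d1`, `d2`, `solve`, and `open_defect` are hypothetical, the normalization $h = 1$, $x_n = 0$ of the text is assumed, and the error constants are obtained by applying each formula to $y = x^5$ and dividing the defect by $5!$.

```python
from fractions import Fraction


def d1(m, t):
    """First derivative of x**m at x = t."""
    return m * t ** (m - 1) if m >= 1 else 0


def d2(m, t):
    """Second derivative of x**m at x = t."""
    return m * (m - 1) * t ** (m - 2) if m >= 2 else 0


def solve(A, b):
    """Solve A z = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(b)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


# Unknowns (alpha0, alpha1, beta0, beta1), with h = 1 and x_n = 0, so that x_{n+1} = 1.
# Row m imposes E = 0 for y = x**m, m = 1, ..., 4; the left member y(1) - y(0) is 1.
A = [[d1(m, 1), d1(m, 0), d2(m, 1), d2(m, 0)] for m in range(1, 5)]
alpha0, alpha1, beta0, beta1 = solve(A, [1, 1, 1, 1])

# Error constant of the closed formula: defect for y = x**5 divided by 5!.
E5 = 1 - (alpha0 * d1(5, 1) + alpha1 * d1(5, 0) + beta0 * d2(5, 1) + beta1 * d2(5, 0))
closed_constant = E5 / 120


def open_defect(m):
    """Defect of (6.14.3) for y = x**m with h = 1, x_{n-1} = -1, x_n = 0, x_{n+1} = 1."""
    rhs = (-1) ** m + 2 * d1(m, -1) + Fraction(2, 3) * (2 * d2(m, 0) + d2(m, -1))
    return 1 - rhs


open_constant = open_defect(5) / 120
print(alpha0, alpha1, beta0, beta1, closed_constant, open_constant)
```

The computed coefficients are 1/2, 1/2, -1/12, and 1/12, and the error constants are 1/720 and 2/45, in agreement with (6.14.2) and (6.14.3).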

Since (6.14.2) affords an accuracy which is generally better than that associated with the result of retaining fourth-order terms in the Taylor expansion
$$y_{n+1} = y_n + hy_n' + \frac{h^2}{2}y_n'' + \frac{h^3}{6}y_n''' + \frac{h^4}{24}y_n^{\mathrm{iv}} + \frac{h^5}{120}y^{\mathrm{v}}(\xi),$$
it is often useful in starting the solution when a procedure of fourth order is appropriate and when the calculation of values of $y'''$ and $y^{\mathrm{iv}}$ is to be avoided. The formula $y_1 \approx y_0 + hy_0' + \tfrac12 h^2 y_0''$ can be used for a prediction in the first step, after which (6.14.3) is available.

When the differential equation is of second order, Eqs. (6.14.2) and 
(6.14.3) are to be supplemented by the two equations obtained by replac- 
ing y by u, where u = y’. Formulas of higher-order accuracy may be 
obtained if derivatives of order three or more are also employed. 

A useful class of formulas, associated with the name of Obrechkoff, can be derived by an inverse method in which we first seek a formula for $\int_0^h \phi(x)\,dx$ with an error expressed in the form
$$E = \frac{1}{(2r)!}\int_0^h x^r(x - h)^r\,\phi^{(2r)}(x)\,dx,$$ (6.14.4)
where $r$ is an arbitrarily prescribed integer. If we integrate by parts $r$ times, there follows immediately
$$E = \frac{(-1)^r}{(2r)!}\int_0^h \phi^{(r)}(x)\,\frac{d^r}{dx^r}\left[x^r(x - h)^r\right]dx,$$
since the integrated terms vanish at both limits, and $r$ additional integrations by parts yield the result
$$E = \int_0^h \phi(x)\,dx - \sum_{k=1}^{r}(-1)^{k-1}\frac{(2r - k)!\,r!}{(2r)!\,k!\,(r - k)!}\,h^k\left[\phi^{(k-1)}(h) + (-1)^{k-1}\phi^{(k-1)}(0)\right],$$ (6.14.5)
which supplies the required formula after a transposition.

If we write $\phi(x) = y'(x)$, and translate the origin to $x_n$, the result takes the form
$$y_{n+1} = y_n + \sum_{k=1}^{r}(-1)^{k-1}\frac{(2r - k)!\,r!}{(2r)!\,k!\,(r - k)!}\,h^k\left[y_{n+1}^{(k)} + (-1)^{k-1}y_n^{(k)}\right] + E,$$ (6.14.6)
where, after an application of the second mean-value theorem, (6.14.4) becomes
$$E = \frac{y^{(2r+1)}(\xi)}{(2r)!}\int_0^h x^r(x - h)^r\,dx = \frac{(-1)^r}{2r + 1}\left[\frac{r!}{(2r)!}\right]^2 h^{2r+1}\,y^{(2r+1)}(\xi),$$ (6.14.7)
with $x_n < \xi < x_{n+1}$.
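The numerical coefficients $(2r - k)!\,r!/[(2r)!\,k!\,(r - k)!]$ appearing in (6.14.6), and the error constant in (6.14.7), are readily tabulated for small $r$. The following Python sketch is an illustration only; the function name `obrechkoff` is hypothetical, and the alternating sign factor $(-1)^{k-1}$ of (6.14.6) is left to be applied separately.

```python
from fractions import Fraction
from math import factorial


def obrechkoff(r):
    """Magnitudes of the coefficients of h**k, k = 1..r, in (6.14.6),
    together with the constant multiplying h**(2r+1) * y^(2r+1)(xi) in (6.14.7)."""
    coeffs = [
        Fraction(factorial(2 * r - k) * factorial(r),
                 factorial(2 * r) * factorial(k) * factorial(r - k))
        for k in range(1, r + 1)
    ]
    error_constant = (Fraction((-1) ** r, 2 * r + 1)
                      * Fraction(factorial(r), factorial(2 * r)) ** 2)
    return coeffs, error_constant


for r in (2, 3):
    print(r, obrechkoff(r))
```

For $r = 2$ the coefficients are 1/2 and 1/12 with error constant 1/720, and for $r = 3$ they are 1/2, 1/10, and 1/120 with error constant -1/100800.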
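When $r = 2$, this result becomes identical with (6.14.2). However, when $r = 3$ we obtain the formula
$$y_{n+1} = y_n + \frac{h}{2}\left(y_{n+1}' + y_n'\right) - \frac{h^2}{10}\left(y_{n+1}'' - y_n''\right) + \frac{h^3}{120}\left(y_{n+1}''' + y_n'''\right) - \frac{h^7}{100800}y^{\mathrm{vii}}(\xi),$$ (6.14.8)
which possesses obvious advantages (in general) over the corresponding formula
$$y_{n+1} = y_n + \frac{h}{2}\left(y_{n+1}' + y_n'\right) - \frac{h^2}{12}\left(y_{n+1}'' - y_n''\right) + \frac{h^4}{720}\left(y_{n+1}^{\mathrm{iv}} - y_n^{\mathrm{iv}}\right) - \frac{h^7}{30240}y^{\mathrm{vii}}(\xi),$$ (6.14.9)
obtained from (6.14.1). An appropriate predictor formula can be obtained in the form
$$y_{n+1} = y_{n-1} + 2h\left(4y_n' - 3y_{n-1}'\right) - \frac{2h^2}{5}\left(8y_n'' + 7y_{n-1}''\right) + \frac{2h^3}{15}\left(7y_n''' - 3y_{n-1}'''\right) + \frac{13h^7}{6300}y^{\mathrm{vii}}(\xi).$$ (6.14.10)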

An infinite variety of other formulas can be derived by employing data relevant to more than two points for calculation, and to more than three points for prediction. Thus, for example, the three-point formula of highest precision, using first and second derivatives, is readily found to be
$$y_{n+1} - 2y_n + y_{n-1} = \frac{3h}{8}\left(y_{n+1}' - y_{n-1}'\right) - \frac{h^2}{24}\left(y_{n+1}'' - 8y_n'' + y_{n-1}''\right) + \frac{h^8}{60480}y^{\mathrm{viii}}(\xi).$$ (6.14.11)

6.15. A Simple Runge-Kutta Method. The methods associated with the names of Runge, Kutta, and others, as applied to the numerical solution of the problem
$$y' = F(x,y), \qquad y(x_0) = y_0,$$ (6.15.1)
effectively replace the result of truncating a Taylor-series expansion of the form
$$y_{n+1} = y_n + hy_n' + \frac{h^2}{2}y_n'' + \frac{h^3}{6}y_n''' + \cdots$$ (6.15.2)
by an approximation in which $y_{n+1}$ is calculated from a formula of the type
$$y_{n+1} = y_n + h\left[\alpha_0F(x_n, y_n) + \alpha_1F(x_n + \mu_1h, y_n + b_1h) + \alpha_2F(x_n + \mu_2h, y_n + b_2h) + \cdots + \alpha_pF(x_n + \mu_ph, y_n + b_ph)\right].$$ (6.15.3)
Here the $\alpha$'s, $\mu$'s, and $b$'s are so determined that, if the right-hand member of (6.15.3) were expanded in powers of the spacing $h$, the coefficients of a certain number of the leading terms would agree with the corresponding coefficients in (6.15.2).

These methods possess the advantages that they are self-starting but do not require the evaluation of derivatives of $F(x,y)$ and hence can be used (even at the beginning of the solution) when $F(x,y)$ is not given by an analytical expression, and also that a change in spacing is easily effected at any intermediate stage of the calculation. On the other hand, each step involves several evaluations of $F(x,y)$, which may be excessively laborious in hand computation, and also the estimation of errors is less simply accomplished than in the previously described methods.

It is convenient, in order both to simplify the derivation and also to systematize the formulation, to express each of the $b$'s in (6.15.3) as a linear combination of the preceding values of $F$. Thus, in place of using the notation of (6.15.3), it is desirable to write the approximation in the form
$$y_{n+1} = y_n + \alpha_0k_0 + \alpha_1k_1 + \cdots + \alpha_pk_p,$$ (6.15.4)
where
$$k_0 = hF(x_n, y_n),$$
$$k_1 = hF(x_n + \mu_1h,\; y_n + \lambda_{10}k_0),$$
$$k_2 = hF(x_n + \mu_2h,\; y_n + \lambda_{20}k_0 + \lambda_{21}k_1),$$
$$\cdots$$
$$k_p = hF(x_n + \mu_ph,\; y_n + \lambda_{p0}k_0 + \lambda_{p1}k_1 + \cdots + \lambda_{p,p-1}k_{p-1}),$$ (6.15.5)
and where the coefficients $\alpha_i$, $\mu_i$, and $\lambda_{ij}$ are to be determined.

Since the actual derivation of such formulas involves considerable algebraic manipulation, we consider in detail only the very simple case $p = 1$, which may serve to illustrate the procedure in the more general case. Thus, writing $\mu$ for $\mu_1$ and $\lambda$ for $\lambda_{10}$, we proceed to determine $\alpha_0$, $\alpha_1$, $\mu$, and $\lambda$ such that
$$y_{n+1} = y_n + \alpha_0k_0 + \alpha_1k_1,$$ (6.15.6)
where
$$k_0 = hF(x_n, y_n), \qquad k_1 = hF(x_n + \mu h,\; y_n + \lambda k_0),$$ (6.15.7)
possesses an expansion in powers of $h$ whose leading terms agree, in so far as is possible, with the leading terms of (6.15.2).

We first obtain the expansion
$$k_1 = h\left[F + (\mu hF_x + \lambda k_0F_y) + \tfrac12\left(\mu^2h^2F_{xx} + 2\mu\lambda hk_0F_{xy} + \lambda^2k_0^2F_{yy}\right) + O(h^3)\right] = hF + h^2(\mu F_x + \lambda FF_y) + \frac{h^3}{2}\left(\mu^2F_{xx} + 2\mu\lambda FF_{xy} + \lambda^2F^2F_{yy}\right) + O(h^4),$$ (6.15.8)
where $F \equiv F(x_n, y_n)$, $F_x \equiv F_x(x_n, y_n)$, and so forth. Hence (6.15.6) becomes
$$y_{n+1} = y_n + h(\alpha_0 + \alpha_1)F + h^2\alpha_1(\mu F_x + \lambda FF_y) + \frac{h^3}{2}\alpha_1\left(\mu^2F_{xx} + 2\mu\lambda FF_{xy} + \lambda^2F^2F_{yy}\right) + O(h^4).$$ (6.15.9)
On the other hand, with the same abbreviated notation, we obtain from (6.15.1) the relations
$$y' = F, \qquad y'' = F_x + FF_y, \qquad y''' = F_{xx} + 2FF_{xy} + F^2F_{yy} + F_y(F_x + FF_y),$$ (6.15.10)
so that (6.15.2) becomes
$$y_{n+1} = y_n + hF + \frac{h^2}{2}(F_x + FF_y) + \frac{h^3}{6}\left[F_{xx} + 2FF_{xy} + F^2F_{yy} + F_y(F_x + FF_y)\right] + O(h^4).$$ (6.15.11)

Thus, if we identify the coefficients of $hF$, $h^2F_x$, and $h^2FF_y$ in (6.15.9) and (6.15.11), we obtain the three conditions
$$\alpha_0 + \alpha_1 = 1, \qquad \mu\alpha_1 = \tfrac12, \qquad \lambda\alpha_1 = \tfrac12,$$ (6.15.12)
involving the four adjustable parameters, which are satisfied if and only if
$$\alpha_0 = 1 - c, \qquad \alpha_1 = c, \qquad \mu = \frac{1}{2c}, \qquad \lambda = \frac{1}{2c},$$
where $c$ is an arbitrary nonzero constant. The expansion (6.15.9) then reduces to
$$y_{n+1} = y_n + hF + \frac{h^2}{2}(F_x + FF_y) + \frac{h^3}{8c}\left(F_{xx} + 2FF_{xy} + F^2F_{yy}\right) + O(h^4),$$ (6.15.13)
and reference to (6.15.10) shows that (6.15.13) or, equivalently, (6.15.6) would then be brought into agreement with (6.15.11) or (6.15.2) if a truncation-error term of the form
$$T_n = h^3\left(\frac16 - \frac{1}{8c}\right)\left[F_{xx} + 2FF_{xy} + F^2F_{yy} + F_y(F_x + FF_y)\right] + \frac{h^3}{8c}F_y(F_x + FF_y) + O(h^4)$$
or
$$T_n = -\frac{h^3}{24c}\left[(3 - 4c)y_n''' - 3F_y(x_n, y_n)y_n''\right] + O(h^4)$$ (6.15.14)
were added to its right-hand member.

The remaining free parameter $c$ clearly cannot be determined so that $T_n$ is of order $h^4$, except in trivial special cases. A convenient choice is $c = \tfrac12$, in which case the second abscissa involved in (6.15.6) and (6.15.7) is $x_{n+1}$, and the formula becomes
$$y_{n+1} = y_n + \tfrac12(k_0 + k_1) + T_n,$$ (6.15.15)
where
$$k_0 = hF(x_n, y_n), \qquad k_1 = hF(x_n + h,\; y_n + k_0),$$ (6.15.16)
and
$$T_n = -\frac{h^3}{12}\left[y_n''' - 3F_y(x_n, y_n)y_n''\right] + O(h^4).$$ (6.15.17)

If, for all values of $x$ and $y$ involved in the calculation, it is known that
$$|F_y(x,y)| \le K,$$ (6.15.18)
then, as in earlier developments, it is readily shown that the propagated error $\varepsilon_n$ in the $n$th step is dominated by the solution of the difference equation
$$e_{n+1} = e_n + \frac{hK}{2}\left[e_n + (e_n + hKe_n)\right] + E$$
or
$$e_{n+1} = \left(1 + hK + \frac{h^2K^2}{2}\right)e_n + E,$$ (6.15.19)

h?K? 
where eo = 0, E = |Ta + Ralmax, hK + ae 1. (6.15.20) 


Further, it can be shown that (6.15.17) can be replaced by
$$T_n = -\frac{h^3}{12}\left[y'''(\xi_1) - 3F_y(x_{n+1}, \eta)\,y''(\xi_2)\right],$$ (6.15.21)
where $\xi_1$ and $\xi_2$ are intermediate between $x_n$ and $x_{n+1}$, and $\eta$ between $y_{n+1}$ and $y_n + hy_n'$. Thus, if the round-off error $R_n$ is ignored, and if
$$|y''(x)| \le M_2, \qquad |y'''(x)| \le M_3,$$ (6.15.22)
it follows after a simple calculation that
$$|\varepsilon_n| \le \frac{h^2(M_3 + 3KM_2)}{12K\left(1 + \tfrac12 hK\right)}\left[\left(1 + hK + \frac{h^2K^2}{2}\right)^n - 1\right].$$ (6.15.23)

The formula (6.15.15), using (6.15.16), is of limited accuracy. Indeed, it can be considered to be a modification of the result of retaining only the first difference in (6.3.2),
$$y_{n+1} = y_n + \frac{h}{2}\left(y_n' + y_{n+1}'\right) - \frac{h^3}{12}y'''(\xi),$$ (6.15.24)
in which the unknown derivative $y_{n+1}' = F(x_{n+1}, y_{n+1})$ is replaced by the approximation $y_{n+1}' \approx F(x_{n+1}, y_n + hy_n')$. This consideration is useful in deriving (6.15.21). The details of the analysis were presented here principally to illustrate the similar but more complicated analysis relevant to formulas of higher-order accuracy, certain of which are listed in the following section.

It is of some importance to notice that the error (6.15.21), associated with (6.15.15) and (6.15.16), depends upon the form of the function $F(x,y)$ as well as upon the solution $y$ itself. This situation is characteristic of formulas of the Runge-Kutta type. For example, whereas the equations $y' = 2(x + 1)$ and $y' = 2y/(x + 1)$ both define the function $y = (x + 1)^2$ when the condition $y(0) = 1$ is imposed, the formula (6.15.15) and (6.15.16) would yield this solution exactly when applied to the first equation, if no round-offs were committed, but would not do so when applied to the second form. On the other hand, the formula (6.15.24) would yield exact results when applied to either form, or to any other first-order equation whose required solution is a polynomial of degree two or less (see also Milne [154, 155]).

At the same time, the mere fact that (6.15.15), with (6.15.16), does not have this last property does not imply that its interpretation as a weakened modification of (6.15.24) is proper in the more general case when the true solution is not such a polynomial. For example, it is easily seen that the use of (6.15.15) and (6.15.16) would yield exact results when applied to the problem $y' = -y/(x + 1)$, $y(0) = 1$, for which the solution is $y = 1/(x + 1)$, whereas the use of (6.15.24) would lead only to an approximation.
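The three examples just cited can be verified by direct calculation. The following Python sketch is an illustration only: the function names `rk2_step` and `integrate` are hypothetical, and exact rational arithmetic is assumed (with the arbitrarily chosen spacing $h = 1/10$ over ten steps) so that no round-offs are committed.

```python
from fractions import Fraction


def rk2_step(F, x, y, h):
    """One step of (6.15.15) with (6.15.16), the truncation term T_n being omitted."""
    k0 = h * F(x, y)
    k1 = h * F(x + h, y + k0)
    return y + (k0 + k1) / 2


def integrate(F, x0, y0, h, steps):
    """Advance from (x0, y0) by the given number of steps and return the final ordinate."""
    x, y = x0, y0
    for _ in range(steps):
        y = rk2_step(F, x, y, h)
        x = x + h
    return y


h = Fraction(1, 10)
zero, one = Fraction(0), Fraction(1)
y_a = integrate(lambda x, y: 2 * (x + 1), zero, one, h, 10)       # true solution (x + 1)**2
y_b = integrate(lambda x, y: 2 * y / (x + 1), zero, one, h, 10)   # same solution, second form
y_c = integrate(lambda x, y: -y / (x + 1), zero, one, h, 10)      # true solution 1/(x + 1)
print(y_a, float(y_b), y_c)
```

At $x = 1$ the first and third calculations reproduce the true values 4 and 1/2 exactly, whereas the second falls slightly short of 4.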

6.16. Runge-Kutta Methods of Higher Order. When $k_0$, $k_1$, and $k_2$ are employed in (6.15.4), corresponding to $p = 2$, it is found that the requirement that the expansion of the right-hand member be correct through $h^3$ terms imposes only six conditions on the eight arbitrary parameters involved, so that a doubly infinite set of such formulas with third-order accuracy can be obtained. One such formula, due to Kutta, is of the form
$$y_{n+1} = y_n + \tfrac16(k_0 + 4k_1 + k_2) + O(h^4),$$ (6.16.1)
where
$$k_0 = hF(x_n, y_n),$$
$$k_1 = hF(x_n + \tfrac12 h,\; y_n + \tfrac12 k_0),$$
$$k_2 = hF(x_n + h,\; y_n + 2k_1 - k_0).$$ (6.16.2)
A second, due to Heun, is of the form
$$y_{n+1} = y_n + \tfrac14(k_0 + 3k_2) + O(h^4),$$ (6.16.3)
where
$$k_0 = hF(x_n, y_n),$$
$$k_1 = hF(x_n + \tfrac13 h,\; y_n + \tfrac13 k_0),$$
$$k_2 = hF(x_n + \tfrac23 h,\; y_n + \tfrac23 k_1).$$ (6.16.4)
These two formulas are generally of about equal accuracy, with each possessing certain obvious computational advantages. Kutta's form is seen to be analogous to the formula of Simpson's rule and would reduce to that formula if $F$ were independent of $y$.

It is also possible to derive a two-parameter family of formulas of fourth-order accuracy, by retaining an additional $k$ in (6.15.4). The simplest such formula, due to Kutta, is of the form
$$y_{n+1} = y_n + \tfrac16(k_0 + 2k_1 + 2k_2 + k_3) + O(h^5),$$ (6.16.5)
where
$$k_0 = hF(x_n, y_n),$$
$$k_1 = hF(x_n + \tfrac12 h,\; y_n + \tfrac12 k_0),$$
$$k_2 = hF(x_n + \tfrac12 h,\; y_n + \tfrac12 k_1),$$
$$k_3 = hF(x_n + h,\; y_n + k_2),$$ (6.16.6)
and would also reduce to Simpson's rule if $F$ were independent of $y$.
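The fourth-order formula (6.16.5), with (6.16.6), can be sketched in Python as follows. The sketch is illustrative only: the names `rk4_step` and `rk4` are hypothetical, and the test problem $y' = y$, $y(0) = 1$ is an assumed example which does not appear in the text.

```python
import math


def rk4_step(F, x, y, h):
    """One step of (6.16.5) with (6.16.6), the O(h^5) term being omitted."""
    k0 = h * F(x, y)
    k1 = h * F(x + h / 2, y + k0 / 2)
    k2 = h * F(x + h / 2, y + k1 / 2)
    k3 = h * F(x + h, y + k2)
    return y + (k0 + 2 * k1 + 2 * k2 + k3) / 6


def rk4(F, x0, y0, h, steps):
    """Advance from (x0, y0) by the given number of steps and return the final ordinate."""
    x, y = x0, y0
    for i in range(steps):
        y = rk4_step(F, x, y, h)
        x = x0 + (i + 1) * h
    return y


# Assumed test problem y' = y, y(0) = 1, for which y(1) = e.
approx = rk4(lambda x, y: y, 0.0, 1.0, 0.1, 10)
print(approx, math.e - approx)

# When F does not depend on y, one step is Simpson's rule over (x_n, x_{n+1}).
simpson = rk4_step(lambda x, y: x ** 3, 0.0, 0.0, 1.0)
print(simpson)
```

The second calculation illustrates the reduction to Simpson's rule: a single step of length unity applied to $y' = x^3$ returns the integral of $x^3$ over (0, 1), for which Simpson's rule is exact.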


Such formulas can also be generalized to the treatment of simultaneous equations of the form
$$\frac{dy}{dx} = F(x, y, u), \qquad \frac{du}{dx} = G(x, y, u),$$ (6.16.7)
where $y$ and $u$ are prescribed when $x = x_0$. In particular, the preceding formula generalizes as follows:
$$y_{n+1} = y_n + \tfrac16(k_0 + 2k_1 + 2k_2 + k_3) + O(h^5),$$
$$u_{n+1} = u_n + \tfrac16(m_0 + 2m_1 + 2m_2 + m_3) + O(h^5),$$ (6.16.8)
where
$$k_0 = hF(x_n, y_n, u_n),$$
$$k_1 = hF(x_n + \tfrac12 h,\; y_n + \tfrac12 k_0,\; u_n + \tfrac12 m_0),$$
$$k_2 = hF(x_n + \tfrac12 h,\; y_n + \tfrac12 k_1,\; u_n + \tfrac12 m_1),$$
$$k_3 = hF(x_n + h,\; y_n + k_2,\; u_n + m_2),$$ (6.16.9)
and
$$m_0 = hG(x_n, y_n, u_n),$$
$$m_1 = hG(x_n + \tfrac12 h,\; y_n + \tfrac12 k_0,\; u_n + \tfrac12 m_0),$$
$$m_2 = hG(x_n + \tfrac12 h,\; y_n + \tfrac12 k_1,\; u_n + \tfrac12 m_1),$$
$$m_3 = hG(x_n + h,\; y_n + k_2,\; u_n + m_2).$$ (6.16.10)
A consideration of this form indicates the way in which other formulas are so generalized.

In particular, when $F = u$, so that (6.16.7) is equivalent to
$$\frac{d^2y}{dx^2} = G(x, y, y'),$$ (6.16.11)
with $u = y'$, (6.16.9) gives
$$k_0 = hy_n', \qquad k_1 = hy_n' + \frac{h}{2}m_0, \qquad k_2 = hy_n' + \frac{h}{2}m_1, \qquad k_3 = hy_n' + hm_2,$$
and hence (6.16.8) and (6.16.10) reduce to
$$y_{n+1} = y_n + hy_n' + \frac{h}{6}(m_0 + m_1 + m_2) + O(h^5),$$
$$y_{n+1}' = y_n' + \tfrac16(m_0 + 2m_1 + 2m_2 + m_3) + O(h^5),$$ (6.16.12)
where
$$m_0 = hG(x_n, y_n, y_n'),$$
$$m_1 = hG(x_n + \tfrac12 h,\; y_n + \tfrac12 hy_n',\; y_n' + \tfrac12 m_0),$$
$$m_2 = hG(x_n + \tfrac12 h,\; y_n + \tfrac12 hy_n' + \tfrac14 hm_0,\; y_n' + \tfrac12 m_1),$$
$$m_3 = hG(x_n + h,\; y_n + hy_n' + \tfrac12 hm_1,\; y_n' + m_2).$$ (6.16.13)
The use of this formula is clearly simplified in those cases when $G$ is independent of $y'$.
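The reduced procedure for a single second-order equation $y'' = G(x, y, y')$, in which $y_{n+1}$ and $y_{n+1}'$ are advanced together by use of four evaluations $m_0, \ldots, m_3$ of $hG$, can be sketched in Python as follows. The sketch is illustrative only: the function name is hypothetical, and the problem $y'' = xy$, $y(0) = 0$, $y'(0) = 1$ of §6.12 is used merely as a convenient test case.

```python
def rk4_second_order_step(G, x, y, yp, h):
    """One fourth-order Runge-Kutta step for y'' = G(x, y, y'), with u = y' eliminated."""
    m0 = h * G(x, y, yp)
    m1 = h * G(x + h / 2, y + h * yp / 2, yp + m0 / 2)
    m2 = h * G(x + h / 2, y + h * yp / 2 + h * m0 / 4, yp + m1 / 2)
    m3 = h * G(x + h, y + h * yp + h * m1 / 2, yp + m2)
    y_new = y + h * yp + h * (m0 + m1 + m2) / 6
    yp_new = yp + (m0 + 2 * m1 + 2 * m2 + m3) / 6
    return y_new, yp_new


# Test case: y'' = x*y, y(0) = 0, y'(0) = 1, advanced to x = 1 with h = 0.1.
h, y, yp = 0.1, 0.0, 1.0
for i in range(10):
    y, yp = rk4_second_order_step(lambda x, y, yp: x * y, i * h, y, yp, h)
print(y, yp)
```

The value obtained for $y(1)$ may be compared with the sum of the Taylor series $x + x^4/12 + x^7/504 + x^{10}/45360 + \cdots$ at $x = 1$.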
A great many modifications and generalizations of these formulas are present in the literature, some of which afford certain computational advantages in certain situations.

One such modification, due to Gill [94], is of particular usefulness when the computation is to be effected by large-scale calculators, in which the storage of data is to be minimized.

No simple expressions are known for the precise truncation errors in the preceding formulas. An estimate of the error can be obtained, in practice, in the following way. Let the truncation error associated with a formula of $r$th-order accuracy, in progressing from the ordinate at $x_n$ to that at $x_{n+1} = x_n + h$, in a single step, be denoted by $C_nh^{r+1}$, and suppose that $C_n$ varies slowly with $n$ and is nearly independent of $h$ when $h$ is small. Then if the true ordinate at $x_{n+1}$ is denoted by $Y_{n+1}$, the value obtained by two steps starting at $x_{n-1}$ by $y_{n+1}^{(h)}$, and the value obtained by a single step with doubled spacing $2h$ by $y_{n+1}^{(2h)}$, there follows approximately
$$Y_{n+1} - y_{n+1}^{(h)} \approx 2C_nh^{r+1}, \qquad Y_{n+1} - y_{n+1}^{(2h)} \approx 2^{r+1}C_nh^{r+1},$$ (6.16.14)
when $h$ is small. The result of eliminating $C_n$ from these approximate relations is then the extrapolation formula†
$$Y_{n+1} \approx y_{n+1}^{(h)} + \frac{y_{n+1}^{(h)} - y_{n+1}^{(2h)}}{2^r - 1}.$$ (6.16.15)

† This is another example of so-called Richardson extrapolation (see §3.6).

Thus if, at certain stages of the advancing calculation, the newly calculated ordinate $y_{n+1}$ is recomputed from $y_{n-1}$ with a doubled spacing, the truncation error in the originally calculated value is approximated by the result of dividing the difference between the two values by the factor $2^r - 1$, that is, by 3 in (6.15.15), by 7 in (6.16.1) or (6.16.3), and by 15 in the formulas of fourth-order accuracy.
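The error estimate obtained by recomputation with a doubled spacing can be sketched in Python as follows. The sketch is illustrative only: the function names are hypothetical, the fourth-order formula of Kutta is used (so that $r = 4$ and the divisor is 15), and the test problem $y' = y$, $y(0) = 1$ with $h = 0.1$ is an assumed example.

```python
import math


def rk4_step(F, x, y, h):
    """One step of Kutta's fourth-order formula."""
    k0 = h * F(x, y)
    k1 = h * F(x + h / 2, y + k0 / 2)
    k2 = h * F(x + h / 2, y + k1 / 2)
    k3 = h * F(x + h, y + k2)
    return y + (k0 + 2 * k1 + 2 * k2 + k3) / 6


def step_doubling_estimate(F, x, y, h, r=4):
    """Advance from x to x + 2h by two steps of h and again by one step of 2h.

    Returns the two-step value y_h and the estimate (y_h - y_2h) / (2**r - 1)
    of the error Y - y_h in that value.
    """
    y_mid = rk4_step(F, x, y, h)
    y_h = rk4_step(F, x + h, y_mid, h)
    y_2h = rk4_step(F, x, y, 2 * h)
    return y_h, (y_h - y_2h) / (2 ** r - 1)


y_h, estimate = step_doubling_estimate(lambda x, y: y, 0.0, 1.0, 0.1)
true_error = math.exp(0.2) - y_h
print(estimate, true_error)
```

In this example the estimate and the true error are both of the order of two units in the seventh decimal place. Close agreement is to be expected only insofar as $C_n$ is nearly constant over the two steps.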

It is apparent that an arbitrary change in spacing can be introduced 
at any stage of the forward progress, when a method of the Runge-Kutta 
type is used, without introducing any appreciable complication. 

6.17. Boundary-value Problems. Problems in which the conditions 
to be satisfied by the solution of a differential equation, of order two or 
greater, are specified at both ends of an interval in which the solution is 
required are known as boundary-value problems and are generally much 
less amenable to numerical analysis than are initial-value problems, in 
which all conditions are imposed at one point. In this section, we con- 
sider briefly the application of certain elementary methods to the numer- 
ical solution of such problems. More efficient methods can often be based 
upon the result of reformulating the problem as an integral equation or as
a problem in the calculus of variations, the treatment of both of which 
falls outside the scope of this work. 


For a linear problem, such as one governed by a second-order equation
of the form

\[ y'' + P(x)\,y' + Q(x)\,y = F(x) \qquad (a < x < b), \tag{6.17.1} \]

and by the end conditions

\[ y(a) = A, \qquad y(b) = B, \tag{6.17.2} \]

where $A$ and $B$ are prescribed, the analysis can be based on the principle
of superposition. Thus, if $u(x)$ is any solution of the equation

\[ u'' + Pu' + Qu = F \tag{6.17.3} \]

which satisfies the initial condition

\[ u(a) = A, \tag{6.17.4} \]

and $v(x)$ is any nontrivial solution of the equation

\[ v'' + Pv' + Qv = 0 \tag{6.17.5} \]

which satisfies the initial condition

\[ v(a) = 0, \tag{6.17.6} \]

then the function

\[ y(x) = u(x) + c\,v(x) \tag{6.17.7} \]

satisfies (6.17.1) and the condition $y(a) = A$ for any constant value of $c$.
Further, if $P$, $Q$, and $F$ are continuous in $(a,b)$, there cannot exist additional
functions having this property. Thus, a solution is found if $c$ can
be determined such that

\[ u(b) + c\,v(b) = B. \tag{6.17.8} \]

Unless $c$ can be so determined, no solution exists.

If $P$, $Q$, and $F$ are continuous in $(a,b)$, the initial slopes $u'(a)$ and $v'(a)$
can be chosen arbitrarily, so long as $v'(a) \neq 0$; the choices $u'(a) = 0$ and
$v'(a) = 1$ are frequently convenient. It may be noticed that, if $u(b) = B$
and $v(b) = 0$, then $c$ is arbitrary and infinitely many solutions exist; if
$u(b) \neq B$ and $v(b) = 0$, then no solution exists. Unless it happens that
$v(b) = 0$, the solution exists uniquely. Any of the previously discussed
methods can be used in determining $u$ and $v$.
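
The superposition procedure can be illustrated by a short program. This is a minimal sketch, assuming that $u$ and $v$ are generated by the classical fourth-order Runge-Kutta formula with the convenient choices $u'(a) = 0$ and $v'(a) = 1$; the function names and the test problem $y'' - y = 0$, $y(0) = 1$, $y(1) = 2$ are hypothetical choices made for illustration.

```python
import math

def integrate_ivp(P, Q, F, a, b, y0, slope0, steps):
    """Integrate y'' + P y' + Q y = F from a to b by RK4; return the ordinates."""
    h = (b - a) / steps

    def deriv(x, y, z):
        # First-order system: y' = z, z' = F - P z - Q y.
        return z, F(x) - P(x) * z - Q(x) * y

    x, y, z = a, y0, slope0
    ordinates = [y]
    for i in range(steps):
        k1y, k1z = deriv(x, y, z)
        k2y, k2z = deriv(x + h / 2, y + h / 2 * k1y, z + h / 2 * k1z)
        k3y, k3z = deriv(x + h / 2, y + h / 2 * k2y, z + h / 2 * k2z)
        k4y, k4z = deriv(x + h, y + h * k3y, z + h * k3z)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        z += h / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
        x = a + (i + 1) * h
        ordinates.append(y)
    return ordinates

def solve_linear_bvp(P, Q, F, a, b, A, B, steps=100):
    """Combine u (u(a)=A, u'(a)=0) and v (v(a)=0, v'(a)=1) so that y(b) = B."""
    u = integrate_ivp(P, Q, F, a, b, A, 0.0, steps)
    v = integrate_ivp(P, Q, lambda x: 0.0, a, b, 0.0, 1.0, steps)
    if abs(v[-1]) < 1e-12:
        raise ValueError("v(b) is (nearly) zero: no unique solution")
    c = (B - u[-1]) / v[-1]          # from u(b) + c v(b) = B
    return [ui + c * vi for ui, vi in zip(u, v)]

# Assumed test problem: y'' - y = 0, y(0) = 1, y(1) = 2.
y = solve_linear_bvp(lambda x: 0.0, lambda x: -1.0, lambda x: 0.0,
                     0.0, 1.0, 1.0, 2.0)
exact_mid = math.cosh(0.5) + (2 - math.cosh(1)) / math.sinh(1) * math.sinh(0.5)
print(y[50], exact_mid)
```

The check on `v[-1]` corresponds to the exceptional case $v(b) = 0$ noted above; the tolerance used there is an arbitrary illustrative value.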

In the case of a corresponding nonlinear problem, such as that governed
by an equation of the more general form

\[ y'' = G(x, y, y') \qquad (a < x < b) \tag{6.17.9} \]

and the end conditions

\[ y(a) = A, \qquad y(b) = B, \]

superposition generally is not valid. One possible procedure consists in
defining $u(x,\alpha)$ as the solution of the initial-value problem

\[ u'' = G(x, u, u'), \tag{6.17.10} \]

\[ u(a) = A, \qquad u'(a) = \alpha, \tag{6.17.11} \]

and attempting to determine $\alpha$ such that

\[ u(b,\alpha) = B. \tag{6.17.12} \]

For this purpose, $u(b)$ could be determined for two or more trial values of $\alpha$.
Then, by linear (or higher-order) inverse interpolation, an "improved"
value of $\alpha$ would be obtained, and the process would be iterated until
(6.17.12) is satisfactorily approximated, if the iteration converges.
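
The trial-slope procedure can be sketched as follows. This is only an illustration, assuming Runge-Kutta integration of the initial-value problem and repeated linear inverse interpolation (the secant rule) between the two most recent trial slopes; the function names and the test problem $y'' = 2y^3$, $y(0) = 1$, $y(1) = \frac12$ (one solution of which is $y = 1/(1+x)$, with $y'(0) = -1$) are hypothetical choices.

```python
def shoot(G, a, b, A, alpha, steps=200):
    """Integrate u'' = G(x, u, u'), u(a) = A, u'(a) = alpha by RK4; return u(b)."""
    h = (b - a) / steps
    x, u, z = a, A, alpha
    for i in range(steps):
        k1u, k1z = z, G(x, u, z)
        k2u, k2z = z + h / 2 * k1z, G(x + h / 2, u + h / 2 * k1u, z + h / 2 * k1z)
        k3u, k3z = z + h / 2 * k2z, G(x + h / 2, u + h / 2 * k2u, z + h / 2 * k2z)
        k4u, k4z = z + h * k3z, G(x + h, u + h * k3u, z + h * k3z)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        z += h / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
        x = a + (i + 1) * h
    return u

def shooting_by_inverse_interpolation(G, a, b, A, B, alpha0, alpha1,
                                      tol=1e-10, max_iter=50):
    """Adjust the initial slope until u(b, alpha) = B, if the iteration converges."""
    f0 = shoot(G, a, b, A, alpha0) - B
    f1 = shoot(G, a, b, A, alpha1) - B
    for _ in range(max_iter):
        if abs(f1) < tol:
            return alpha1
        if f1 == f0:
            break                     # interpolation is no longer possible
        # Linear inverse interpolation between the two latest trial slopes.
        alpha0, alpha1 = alpha1, alpha1 - f1 * (alpha1 - alpha0) / (f1 - f0)
        f0, f1 = f1, shoot(G, a, b, A, alpha1) - B
    if abs(f1) < tol:
        return alpha1
    raise RuntimeError("iteration did not converge")

# Assumed test problem: y'' = 2 y^3, y(0) = 1, y(1) = 1/2.
alpha = shooting_by_inverse_interpolation(lambda x, u, z: 2 * u ** 3,
                                          0.0, 1.0, 1.0, 0.5, -0.9, -1.1)
print("initial slope:", alpha)
```

The trial slopes here were deliberately chosen near the known answer; with poorer starting values the iteration need not converge, in keeping with the remarks which follow.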

The process is apt to be tedious, and is complicated by the fact that
small changes in $\alpha$ do not necessarily correspond to small changes in
$u(b,\alpha)$. Further, the basic questions of existence and uniqueness of the
solution are particularly troublesome in themselves, in the general
nonlinear case. There exists no completely satisfactory general method
(numerical or otherwise) for dealing with such problems.

Problems in which the end conditions prescribe $y'$ or a linear combination
of $y$ and $y'$, or which are expressed in a more complicated way,
involve more or less obvious modifications.

Another class of methods, which is usually convenient only when the
problem is linear, consists in approximating the differential equation by
a difference equation and in solving the simultaneous set of algebraic
equations resulting from the requirement that this equation be satisfied
at each of a set of equally spaced points in the relevant interval.†
In illustration, any linear second-order equation can be transformed
to an equation of the form

\[ y'' + f(x)\,y = g(x), \tag{6.17.13} \]

as was pointed out at the end of §6.12. Reference to Eq. (5.6.7) yields
the relation

\[ y_{n+1} - 2y_n + y_{n-1} = \frac{h^2}{12}\left(y''_{n+1} + 10\,y''_n + y''_{n-1}\right) + T_n, \tag{6.17.14} \]

where the truncation error $T_n$ is expressible in the form

\[ T_n = -\frac{h^6}{240}\, y^{\mathrm{vi}}(\xi) \qquad (x_{n-1} < \xi < x_{n+1}). \tag{6.17.15} \]

Thus, if we use (6.17.13) to replace $y''_k$ by $g_k - f_k y_k$, (6.17.14) takes
the form

\[ \left(1 + \frac{h^2}{12} f_{n+1}\right) y_{n+1} - 2\left(1 - \frac{5h^2}{12} f_n\right) y_n + \left(1 + \frac{h^2}{12} f_{n-1}\right) y_{n-1}
 = \frac{h^2}{12}\left(g_{n+1} + 10\,g_n + g_{n-1}\right) + T_n. \tag{6.17.16} \]

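Now, if the interval $(a,b)$ is divided into $N + 1$ equal parts, in such a
way that $x_0 = a$, $x_1 = a + h$, $\ldots$, $x_N = a + Nh$, $x_{N+1} = b$, where
$h = (b - a)/(N + 1)$, we may require the result of ignoring $T_n$ in
(6.17.16) to hold for $n = 1, 2, \ldots, N$, and so obtain a set of $N$ simultaneous
linear algebraic equations in $y_1, y_2, \ldots, y_N$, of the form

\[
\begin{aligned}
\left(1 + \frac{h^2}{12} f_0\right) y_0 - 2\left(1 - \frac{5h^2}{12} f_1\right) y_1 + \left(1 + \frac{h^2}{12} f_2\right) y_2 &= \frac{h^2}{12}\, G_1, \\
\left(1 + \frac{h^2}{12} f_1\right) y_1 - 2\left(1 - \frac{5h^2}{12} f_2\right) y_2 + \left(1 + \frac{h^2}{12} f_3\right) y_3 &= \frac{h^2}{12}\, G_2, \\
&\;\;\vdots \\
\left(1 + \frac{h^2}{12} f_{N-1}\right) y_{N-1} - 2\left(1 - \frac{5h^2}{12} f_N\right) y_N + \left(1 + \frac{h^2}{12} f_{N+1}\right) y_{N+1} &= \frac{h^2}{12}\, G_N,
\end{aligned}
\tag{6.17.17}
\]

where $G(x) = g(x + h) + 10\,g(x) + g(x - h)$, supplemented by the prescribed
conditions $y_0 = A$ and $y_{N+1} = B$.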

In virtue of (6.17.15), this procedure is of fifth order, in the sense that
it would afford exact results if $y(x)$ were a polynomial of degree five or
less. A simpler procedure, of third order, corresponds to the neglect of
second differences in (6.17.14) and hence consists in solving the simultaneous
equations

\[ y_{n+1} - 2\left(1 - \frac{h^2}{2} f_n\right) y_n + y_{n-1} = h^2 g_n \qquad (n = 1, 2, \ldots, N), \tag{6.17.18} \]

supplemented by the boundary conditions.

† Such methods are often preferred to step-by-step methods since their use does not
involve stepwise propagation of errors.
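
The fifth-order set (6.17.17) is tridiagonal and can be solved by direct elimination. The following is a minimal sketch, assuming elimination without pivoting is adequate for the problem at hand; the function name and the test problem $y'' - y = 0$, $y(0) = 0$, $y(1) = 1$ (that is, $f = -1$, $g = 0$) are hypothetical choices made for illustration.

```python
import math

def solve_bvp_fifth_order(f, g, a, b, A, B, N):
    """Solve y'' + f(x) y = g(x), y(a) = A, y(b) = B by the set (6.17.17)."""
    h = (b - a) / (N + 1)
    x = [a + n * h for n in range(N + 2)]
    c = h * h / 12
    # Row n (n = 1..N): lower*y[n-1] + diag*y[n] + upper*y[n+1] = rhs.
    lower = [1 + c * f(x[n - 1]) for n in range(1, N + 1)]
    diag = [-2 * (1 - 5 * c * f(x[n])) for n in range(1, N + 1)]
    upper = [1 + c * f(x[n + 1]) for n in range(1, N + 1)]
    rhs = [c * (g(x[n + 1]) + 10 * g(x[n]) + g(x[n - 1])) for n in range(1, N + 1)]
    # Move the prescribed end ordinates to the right-hand members.
    rhs[0] -= lower[0] * A
    rhs[-1] -= upper[-1] * B
    # Forward elimination followed by back substitution.
    for i in range(1, N):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    y = [0.0] * N
    y[-1] = rhs[-1] / diag[-1]
    for i in range(N - 2, -1, -1):
        y[i] = (rhs[i] - upper[i] * y[i + 1]) / diag[i]
    return x, [A] + y + [B]

# Assumed test problem: y'' - y = 0, y(0) = 0, y(1) = 1, with h = 0.1.
x, y = solve_bvp_fifth_order(lambda t: -1.0, lambda t: 0.0, 0.0, 1.0, 0.0, 1.0, 9)
exact_mid = math.sinh(0.5) / math.sinh(1.0)
print(y[5], exact_mid)
```

Replacing the three coefficient lists by `1`, `-2 * (1 - h * h / 2 * f(x[n]))`, `1` and the right-hand member by `h * h * g(x[n])` would give the simpler third-order set (6.17.18).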
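On the other hand, if fourth differences are retained in (5.6.7), the
corresponding equations are easily obtained in the form

\[
\begin{aligned}
-\frac{h^2}{240} f_{n+2}\, y_{n+2} &+ \left(1 + \frac{h^2}{10} f_{n+1}\right) y_{n+1} - 2\left(1 - \frac{97h^2}{240} f_n\right) y_n \\
&+ \left(1 + \frac{h^2}{10} f_{n-1}\right) y_{n-1} - \frac{h^2}{240} f_{n-2}\, y_{n-2} \\
&= \frac{h^2}{240}\left(-g_{n+2} + 24\,g_{n+1} + 194\,g_n + 24\,g_{n-1} - g_{n-2}\right),
\end{aligned}
\tag{6.17.19}
\]

and would reduce to identities for $n = 2, 3, \ldots, N - 1$ if $y(x)$ were a
polynomial of degree seven or less. For $n = 1$ and $n = N$, Eq. (6.17.19)
would involve the irrelevant quantities $y_{-1}$ and $y_{N+2}$. Two additional
"off-center" relations, which would also be satisfied exactly by any polynomial
solution of degree seven or less, are thus needed. These can be
obtained, for example, by retaining fifth differences in the backward-difference
formula (5.5.11), relative to $\nabla^2 y_{N+1}$, and in the corresponding
forward-difference formula relative to $\Delta^2 y_0$, in the forms†

\[
\begin{aligned}
\left(1 + \frac{3h^2}{40} f_0\right) y_0 &- 2\left(1 - \frac{209h^2}{480} f_1\right) y_1 + \left(1 + \frac{h^2}{60} f_2\right) y_2 \\
&+ \frac{7h^2}{120} f_3\, y_3 - \frac{h^2}{40} f_4\, y_4 + \frac{h^2}{240} f_5\, y_5 \\
&= \frac{h^2}{240}\left(18\,g_0 + 209\,g_1 + 4\,g_2 + 14\,g_3 - 6\,g_4 + g_5\right)
\end{aligned}
\tag{6.17.20}
\]

and

\[
\begin{aligned}
\frac{h^2}{240} f_{N-4}\, y_{N-4} &- \frac{h^2}{40} f_{N-3}\, y_{N-3} + \frac{7h^2}{120} f_{N-2}\, y_{N-2} + \left(1 + \frac{h^2}{60} f_{N-1}\right) y_{N-1} \\
&- 2\left(1 - \frac{209h^2}{480} f_N\right) y_N + \left(1 + \frac{3h^2}{40} f_{N+1}\right) y_{N+1} \\
&= \frac{h^2}{240}\left(g_{N-4} - 6\,g_{N-3} + 14\,g_{N-2} + 4\,g_{N-1} + 209\,g_N + 18\,g_{N+1}\right).
\end{aligned}
\tag{6.17.21}
\]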
+ The same relations can be obtained by using the approximate relations 
The $N - 2$ equations (6.17.19), the two equations (6.17.20) and (6.17.21),
and the two prescribed end conditions serve to determine (approximately)
the values of the $N + 2$ ordinates $y_0, y_1, \ldots, y_N, y_{N+1}$. Here the
interval $(a,b)$ must be divided into at least five equal parts.

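If the prescribed end condition at $x = a = x_0$ involves $y'(a)$ in place
of (or in combination with) $y(a)$, that condition can be replaced by an
appropriate approximate one, involving $y_0$ and $y_1$, by means of the result
of retaining terms through $h^n$ (where $n$ is the order of the procedure used)
in the expansion

\[ \Delta y_0 = \left(hD + \frac{1}{2!} h^2 D^2 + \frac{1}{3!} h^3 D^3 + \frac{1}{4!} h^4 D^4 + \cdots\right) y_0, \]

combined with (6.17.13), to give

\[
\begin{aligned}
y_1 - y_0 = h\,y'_0 &+ \frac{h^2}{2}\left(g_0 - f_0 y_0\right) + \frac{h^3}{6}\left(g'_0 - f'_0 y_0 - f_0 y'_0\right) \\
&+ \frac{h^4}{24}\left(g''_0 - f''_0 y_0 - 2f'_0 y'_0 - f_0 g_0 + f_0^2 y_0\right) + \cdots
\end{aligned}
\]

or

\[
\begin{aligned}
y_1 = {} & y_0\left[1 - \frac{h^2}{2} f_0 - \frac{h^3}{6} f'_0 - \frac{h^4}{24}\left(f''_0 - f_0^2\right) - \frac{h^5}{120}\left(f'''_0 - 4 f_0 f'_0\right) - \cdots\right] \\
& + y'_0\left[h - \frac{h^3}{6} f_0 - \frac{h^4}{12} f'_0 - \frac{h^5}{120}\left(3f''_0 - f_0^2\right) - \cdots\right] \\
& + \left[\frac{h^2}{2} g_0 + \frac{h^3}{6} g'_0 + \frac{h^4}{24}\left(g''_0 - f_0 g_0\right) + \frac{h^5}{120}\left(g'''_0 - 3 f'_0 g_0 - f_0 g'_0\right) + \cdots\right].
\end{aligned}
\tag{6.17.22}
\]

A similar relation, for use at $x = b = x_{N+1}$, is obtainable from the
expansion

\[ \nabla y_{N+1} = \left(hD - \frac{1}{2!} h^2 D^2 + \frac{1}{3!} h^3 D^3 - \frac{1}{4!} h^4 D^4 + \cdots\right) y_{N+1}. \]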


6.18. Linear Characteristic-value Problems. When $g(x) \equiv 0$ in
(6.17.13), and the prescribed end conditions are of the special form
$y(a) = y(b) = 0$, one solution of the problem is clearly the "trivial solution"
$y(x) \equiv 0$. It frequently happens that an arbitrary constant
parameter $\lambda$ is linearly involved in the definition of the function $f(x)$ and
that it is then desired to determine values of $\lambda$ for which the problem
also admits a nontrivial solution. Such values of $\lambda$ are known as its
characteristic values (or eigenvalues), and the corresponding solutions are
called the characteristic functions of the problem. The study of their
properties and applications comprises an important field of mathematics,








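\[ \Delta^6 y''_{-1} \equiv \Delta^6\left(g_{-1} - f_{-1} y_{-1}\right) = 0, \qquad \nabla^6 y''_{N+2} \equiv \nabla^6\left(g_{N+2} - f_{N+2} y_{N+2}\right) = 0 \]

to eliminate the ordinates $y_{-1}$ and $y_{N+2}$ from the equations which correspond to setting
$n = 1$ and $n = N$ in (6.17.19).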
and a great variety of methods have been (and are being) devised for
their approximate numerical treatment.

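One such method can be based on the result of appropriately specializing
(6.17.17). Thus, if the problem is of the form

\[ y'' + [q(x) + \lambda\, r(x)]\,y = 0, \qquad y(a) = y(b) = 0, \tag{6.18.1} \]

we may replace $f$ by $q + \lambda r$ in (6.17.17), to obtain a set of $N$ equations of
the form

\[
\begin{aligned}
-2\left[\left(1 - \tfrac{5h^2}{12} q_1\right) - \tfrac{5h^2}{12} r_1 \lambda\right] y_1 + \left[\left(1 + \tfrac{h^2}{12} q_2\right) + \tfrac{h^2}{12} r_2 \lambda\right] y_2 &= 0, \\
\left[\left(1 + \tfrac{h^2}{12} q_1\right) + \tfrac{h^2}{12} r_1 \lambda\right] y_1 - 2\left[\left(1 - \tfrac{5h^2}{12} q_2\right) - \tfrac{5h^2}{12} r_2 \lambda\right] y_2 + \left[\left(1 + \tfrac{h^2}{12} q_3\right) + \tfrac{h^2}{12} r_3 \lambda\right] y_3 &= 0, \\
&\;\;\vdots \\
\left[\left(1 + \tfrac{h^2}{12} q_{N-1}\right) + \tfrac{h^2}{12} r_{N-1} \lambda\right] y_{N-1} - 2\left[\left(1 - \tfrac{5h^2}{12} q_N\right) - \tfrac{5h^2}{12} r_N \lambda\right] y_N &= 0.
\end{aligned}
\tag{6.18.2}
\]

This set of homogeneous linear equations will admit a nontrivial solution
for $y_1, y_2, \ldots, y_N$ if and only if the determinant of the array of
coefficients vanishes (see §10.2), a requirement which demands that $\lambda$
be a root of an algebraic equation of degree $N$ if no one of the values of
$r$ vanishes, as is generally the case in practice. For each such value of $\lambda$,
this set of equations becomes redundant and (at least) one equation can
be ignored, after which the remaining equations can be solved for the
ratios of certain of the ordinates to the remaining one or ones. Except
in unusual cases, only one of the ordinates (but, generally, any one) can
be chosen arbitrarily, and the ratios of the remaining ones to that one are
determinate. In this way, approximations to $N$ of the characteristic
numbers (generally the $N$ smallest ones) of the true problem are obtained,
together with ordinates of the corresponding characteristic functions,
defined within a common arbitrary multiplicative factor.

The crudest approximation is obtained by taking $N = 1$, so that only
the central ordinate $y_1$, at $x = (a + b)/2$, is involved, and $y_0 = y_2 = 0$.
Thus only the equation

\[ -2\left[\left(1 - \frac{5h^2}{12} q_1\right) - \frac{5h^2}{12} r_1 \lambda\right] y_1 = 0 \]

is obtained, and the requirement $y_1 \neq 0$ leads to the approximation

\[ \lambda_1^{(1)} = \frac{12}{5h^2 r_1}\left(1 - \frac{5h^2}{12} q_1\right) \tag{6.18.3} \]

to the smallest characteristic number $\lambda_1$. The ordinate $y_1$ is then
indeterminate.

When $N = 2$, the two permissible values of $\lambda$ are the roots
of the determinantal equation

\[
\begin{vmatrix}
-2\left[\left(1 - \tfrac{5h^2}{12} q_1\right) - \tfrac{5h^2}{12} r_1 \lambda\right] & \left(1 + \tfrac{h^2}{12} q_2\right) + \tfrac{h^2}{12} r_2 \lambda \\[4pt]
\left(1 + \tfrac{h^2}{12} q_1\right) + \tfrac{h^2}{12} r_1 \lambda & -2\left[\left(1 - \tfrac{5h^2}{12} q_2\right) - \tfrac{5h^2}{12} r_2 \lambda\right]
\end{vmatrix} = 0,
\tag{6.18.4}
\]

where $h = (b - a)/3$, and may be denoted by $\lambda_1^{(2)}$ and $\lambda_2^{(1)}$. For each
of these calculated values of $\lambda$, there follows also, from the first equation,

\[ \frac{y_2}{y_1} = 2\,\frac{\left(1 - 5h^2 q_1/12\right) - \left(5h^2 r_1/12\right)\lambda}{\left(1 + h^2 q_2/12\right) + \left(h^2 r_2/12\right)\lambda}, \tag{6.18.5} \]

with $y_1$ arbitrary. The use of the second equation leads to an
equivalent result.

In illustration, the problem

\[ Y'' + 2Y' + \lambda x Y = 0, \qquad Y(0) = Y(1) = 0 \tag{6.18.6} \]

is transformed to the problem

\[ y'' - y + \lambda x y = 0, \qquad y(0) = y(1) = 0, \tag{6.18.7} \]

with the change of variables

\[ Y = e^{-x} y, \tag{6.18.8} \]

in accordance with (6.12.23). With $N = 1$, so that $h = \tfrac12$,
(6.18.3) yields

\[ \lambda_1^{(1)} = \tfrac{96}{5}\left(1 + \tfrac{5}{48}\right) = 21.2. \]

Equation (6.18.4) becomes

\[
\begin{vmatrix}
-2\,\dfrac{339 - 5\lambda}{324} & \dfrac{321 + 2\lambda}{324} \\[8pt]
\dfrac{321 + \lambda}{324} & -2\,\dfrac{339 - 10\lambda}{324}
\end{vmatrix} = 0
\]

and expands into the relevant characteristic equation

\[ 22\lambda^2 - 2367\lambda + 39627 = 0, \]

to yield a second approximation $\lambda_1^{(2)} \approx 20.74$ to $\lambda_1$ and a first approximation
$\lambda_2^{(1)} \approx 89.38$ to $\lambda_2$. Equation (6.18.5) becomes

\[ \frac{y_2}{y_1} = \frac{678 - 10\lambda}{321 + 2\lambda} \]

and yields $y_2/y_1 \approx 1.30$ for $\lambda_1^{(2)}$ and $y_2/y_1 \approx -0.432$ for $\lambda_2^{(1)}$. Thus
from (6.18.8) there follows $Y(\tfrac23)/Y(\tfrac13) \approx 0.93$ in the first "mode" and
$-0.31$ in the second one.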
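
The smallest root of the determinantal equation can also be located numerically, without expanding the determinant. The following is a minimal sketch, assuming that the tridiagonal determinant of (6.18.2) is evaluated by the usual three-term recurrence and that the smallest positive root is found by a coarse scan followed by bisection; the function names, the scan step, and the upper search limit are hypothetical choices, and the illustrative problem $y'' - y + \lambda x y = 0$, $y(0) = y(1) = 0$ is the one treated above.

```python
def coefficient_determinant(q, r, a, b, N, lam):
    """Determinant of the coefficient array of (6.18.2) for a trial value lam."""
    h = (b - a) / (N + 1)
    c = h * h / 12
    f = [q(a + n * h) + lam * r(a + n * h) for n in range(N + 2)]
    d_prev2, d_prev1 = 1.0, -2 * (1 - 5 * c * f[1])
    for n in range(2, N + 1):
        diag = -2 * (1 - 5 * c * f[n])
        lower = 1 + c * f[n - 1]        # coefficient of y_{n-1} in row n
        upper_prev = 1 + c * f[n]       # coefficient of y_n in row n-1
        d_prev2, d_prev1 = d_prev1, diag * d_prev1 - lower * upper_prev * d_prev2
    return d_prev1

def smallest_characteristic_value(q, r, a, b, N, lam_max=1000.0, scan_step=1.0):
    """Smallest positive root of the determinant, by scanning and bisection."""
    det = lambda lam: coefficient_determinant(q, r, a, b, N, lam)
    lo, d_lo = 0.0, det(0.0)
    while lo < lam_max:
        hi = lo + scan_step
        d_hi = det(hi)
        if d_lo * d_hi <= 0:            # sign change: a root lies in [lo, hi]
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                d_mid = det(mid)
                if d_lo * d_mid <= 0:
                    hi = mid
                else:
                    lo, d_lo = mid, d_mid
            return 0.5 * (lo + hi)
        lo, d_lo = hi, d_hi
    raise ValueError("no sign change found below lam_max")

# Illustrative problem of the text: q(x) = -1, r(x) = x on (0, 1).
q = lambda x: -1.0
r = lambda x: x
lam_N1 = smallest_characteristic_value(q, r, 0.0, 1.0, 1)
lam_N2 = smallest_characteristic_value(q, r, 0.0, 1.0, 2)
print(lam_N1, lam_N2)
```

A scan of this kind can miss a pair of roots lying within one scan step of each other, so the step should be small compared with the expected spacing of the characteristic numbers.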

If three interior ordinates were used, to afford improved approximations
to $\lambda_1$ and $\lambda_2$, and a first approximation to $\lambda_3$, it would be necessary to
expand a determinant of third order and to determine the roots of a cubic
equation. Various iterative techniques for determining the roots of the
relevant characteristic equation without explicitly expanding the determinant,
in such cases, exist in the literature (for example, see Frazer,
Duncan, and Collar [88]).

A simpler procedure would be based on the use of (6.17.18), whereas a
more elaborate procedure could be based on (6.17.19) to (6.17.21).

Modifications, which are appropriate to situations when a linear combination
of $y$ and $y'$ is required to vanish at each end of the interval, may
be based on the use of (6.17.22) and a similar relation relevant to $x = b$,
with $f = q + \lambda r$ and $g = 0$, in place of the conditions $y_0 = y_{N+1} = 0$.

In the special case when $q(x) = 0$ and $r(x) = 1$, so that (6.18.1)
reduces to $y'' + \lambda y = 0$, where $y(a) = y(b) = 0$, the exact value of the
$r$th characteristic value is easily found to be

\[ \lambda_r = \frac{r^2 \pi^2}{(b - a)^2}. \]

The approximation $\tilde{\lambda}_r$ afforded by use of the simpler procedure is known
to be

\[ \tilde{\lambda}_r = \frac{4}{h^2} \sin^2\left(\frac{h}{2}\sqrt{\lambda_r}\right), \]

whereas that afforded by use of (6.18.2) can be shown to be

\[ \tilde{\lambda}_r = \frac{4 \sin^2\left[(h/2)\sqrt{\lambda_r}\right]}{h^2\left\{1 - \tfrac13 \sin^2\left[(h/2)\sqrt{\lambda_r}\right]\right\}}, \]

where $h = (b - a)/(N + 1)$, from which results the nature of the approximations
to the first $N$ characteristic numbers can be determined in the
two cases. In particular, the error in the former case is found to be
positive and less than $h^2\lambda^2/12$, whereas that in the latter case is positive
and less than $h^4\lambda^3/240$. The error associated with the use of the more
elaborate seventh-order procedure would be of the order of $h^6\lambda^4$. These
facts permit crude preliminary estimates of the requisite number of subdivisions
in similar (but less simple) cases.
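
The two closed-form approximations can be tabulated directly, which gives a quick feeling for the number of subdivisions needed. This is only an illustrative sketch, assuming the interval $(0, \pi)$, for which $\lambda_r = r^2$; the function name and the choice $N = 9$ are hypothetical.

```python
import math

def characteristic_value_approximations(r, N, a=0.0, b=math.pi):
    """Return (exact, simple-procedure, fifth-order) values of the r-th
    characteristic number of y'' + lam*y = 0, y(a) = y(b) = 0."""
    h = (b - a) / (N + 1)
    lam = (r * math.pi / (b - a)) ** 2
    s2 = math.sin(0.5 * h * math.sqrt(lam)) ** 2
    simple = 4 * s2 / h ** 2
    fifth_order = 4 * s2 / (h ** 2 * (1 - s2 / 3))
    return lam, simple, fifth_order

N = 9
h = math.pi / (N + 1)
lam, simple, fifth_order = characteristic_value_approximations(1, N)
err_simple = lam - simple
err_fifth = lam - fifth_order
print("simple procedure error:", err_simple, " h^2 lam^2 / 12 =", h ** 2 * lam ** 2 / 12)
print("fifth-order error:     ", err_fifth, " h^4 lam^3 / 240 =", h ** 4 * lam ** 3 / 240)
```

For the lowest mode with this spacing both errors come out positive and close to the quoted leading-order quantities $h^2\lambda^2/12$ and $h^4\lambda^3/240$; for higher modes the agreement deteriorates as $h\sqrt{\lambda_r}$ grows.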
When only the smallest characteristic number, and the associated
mode, is required, the present methods are objectionable since the deter- 
mination of the Nth approximation essentially entails the formation and 
solution of an algebraic equation of Nth degree (although, as was men- 
tioned above, some of this labor can be avoided by matrix iteration). 
In such cases, the use of a variational method, such as that based on 
Rayleigh’s principle (see Temple and Bickley [223]), is often preferable. 

6.19. Selection of a Method. Whereas a rather large number of 
methods for dealing with initial-value problems have been outlined in 
this chapter, it should be remarked that a substantial number of addi- 
tional variations may also be found in the literature. The problem of 
deciding which one of these methods is most appropriate, in any specific 
situation, is a particularly troublesome one because of the large number of 
factors which may affect the decision. 

First of all, the choice will depend upon the nature of the computational 
device to be used. Thus, for example, a method which is well adapted to 
the use of a desk calculator may be inconvenient for hand calculation 
because of the fact that it involves too many operations with multidigit 
numbers; or it may be inappropriate for a certain large-scale computing
device for the reason that each step in the advancing calculation requires
the availability of more independent data than can be accommodated in
the storage unit of that device, the execution of operations for which the
arithmetical unit is not well adapted, or a more involved sequence of
operations than can be programed in the control unit. These restrictions
vary so considerably from one device to another that no general discussion
can be attempted here.

Again, a procedure which involves a large number of iterations of a 
relatively simple technique may be remarkably well adapted to an auto- 
matic high-speed calculator, but its use may entail a prohibitive amount 
of time when the calculations are to be made by hand. 

A procedure which involves a large number of evaluations of a certain
function $F(x,y)$ may not be objectionable for hand calculation if $F(x,y)$ is
well tabulated, but may require an undesirably complicated program in
machine calculation. On the other hand, the situation may be completely
reversed if the function is of complicated analytical form and if a
routine is available for generating it directly in the computer.

Stability considerations may be of great importance, for a given problem,
when a large number of steps is to be taken, but may be much less
significant when only a relatively small number of ordinates is required, or
when a different problem is dealt with. The computational advantages
associated with a simple procedure with relatively large truncation error
must be weighed against the fact that the use of that procedure generally
requires a small spacing and a correspondingly large number of steps, and
hence increases the importance of the effects of round-off errors. At the
same time, such a procedure may be preferred to a more elaborate one 
when its use permits a fairly confident estimate of an upper bound on the 
total propagated error, whereas such an estimate corresponding to the use 
of the more elaborate procedure is not readily available. 

One may be faced with the problem of choosing the procedure which is 
most appropriate in the solution of a single differential equation, or that 
which appears to be best on the average for a wide class of equations. 

The methods described in this chapter, for advancing the solution, fall 
into three broad classes: (1) methods which express the future ordinate 
as a linear combination of present and/or past ordinates and slopes 
(§§6.5, 6.6, 6.10, 6.12); (2) methods which also involve the calculation of 
certain higher derivatives (§6.14); and (3) methods in which the deter- 
mination of the future ordinate does not involve memory of the past 
(§§6.15, 6.16). 

The Euler procedure and its modification of closed type are the simplest 
of the procedures in the first class. The Adams method and its closed- 
type modification and the Milne methods are the most frequently used 
procedures of higher-order accuracy in this class. In either case, the 
relevant formulas may be expressed either in terms of differences of slopes
or in terms of the slopes themselves. The Milne procedure using a given
number of slopes is generally of higher-order accuracy than the corresponding
Adams procedure, but it compares unfavorably with the latter from
the point of view of stability. Except for the crude Euler method, the
procedures in this class are not self-starting.

The methods of the second class are highly efficient when and only
when the differential equation is of such a form that analytical relations
between higher derivatives of the unknown function and the function
itself are readily obtained. If derivatives at only two points are used,
the special methods actually treated are effectively self-starting.

The Runge-Kutta methods, of the third class, possess the advantage
that, since their use at each stage of the advancing calculation does not
require information relevant to past stages, they are completely self-starting
and are particularly appropriate when memory requirements are
to be minimized. Furthermore, these procedures are inherently stable
and are such that a change in spacing is easily effected at any stage of the
advance. The principal disadvantage consists in the fact that each forward
step entails several evaluations of the "right-hand member" of the
differential equation, a fact which may be of considerable importance in
hand calculation. In addition, the nonexistence of a tractable expression
for the associated truncation error is a source of some inconvenience.

For the purpose of starting a solution, when a method of the first class
is to be used for advancing the solution, one may choose among the
methods of the second and third classes as well as the methods of §6.4.
When the right-hand member of the differential equation is of such a form
that the formation of higher derivatives is readily effected, the use of
Taylor series (§6.4) or of series which involve values of higher derivatives
at two points (§6.14) is often convenient. Otherwise, resort may be had
to one of the iterative methods of §6.4 or to the methods of Runge-Kutta
type (§§6.15, 6.16).

6.20. Supplementary References. Texts on the numerical solution of
ordinary differential equations include Levy and Baggot [132], Collatz
[58], and Milne [155], each of which presents a useful bibliography.
For analytical methods of solution and general theory, see Bateman [32]
and Ince [120]. Selected journal references to special numerical methods
are included in the present bibliography. For stability considerations
relevant to step-by-step methods, see Todd [224], Rutishauser [200],
Mitchell and Craggs [159], and Collatz [59]. Errors are discussed
by von Mises [156], Brouwer [51], Tollmien [225, 226], Turton [228],
Rademacher [187], Duncan [71], Weissinger [233, 234], Bieberbach [41],
Mohr [161], Richter [196], Matthieu [144], and Sterne [217]. Round-off
error bias is treated by Huskey and Hartree [119] and Forsythe [79]. For
the numerical solution of boundary-value and characteristic-value problems
governed by ordinary differential equations, see Collatz [56],
Nyström [170], Temple [222], and Fox [83]. Sampling ("Monte Carlo")
methods, described, for example, in Householder, Forsythe, and Germond
[118] and by Curtiss [66], are intended primarily for the solution of more
involved problems, but are also applicable, in principle, to ordinary differential
equations.


PROBLEMS

Section 6.2

1. Show that the operator affecting $y'_n$ in the open formula (6.2.12) relating $y_{n+1}$ and
$y_{n-p}$ can be obtained by multiplying the one corresponding to $p = 0$ in (6.2.10) by

\[ \frac{1 - (1 - \nabla)^{p+1}}{\nabla} = (p + 1) - \frac{(p + 1)p}{2!}\,\nabla + \frac{(p + 1)p(p - 1)}{3!}\,\nabla^2 - \cdots, \]

and use this method to derive (6.2.13) to (6.2.15), as well as the formulas corresponding
to $p = 2$ and $p = 4$.

2. Verify that the results of terminating (6.2.13) to (6.2.15) with the zeroth, second,
and fourth differences, respectively, are Newton-Cotes formulas of open type.

Section 6.3

3. Show that the method used in Prob. 1 also applies to the closed formulas, and
derive the formulas relating $y_{n+1}$ and $y_{n-p}$ from (6.3.2) for $p = 1, 2, 3, 4,$ and 5 in this
way.


4. Verify that the results of terminating (6.3.3) to (6.3.5) with the second, fourth,
and sixth differences, respectively, are Newton-Cotes formulas of closed type.
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if no round-offs were effected, and that corresponding to retention of first differ- 
: ences in (6.3.3) would be of the form 
Section 6. 
; we ing toc = +0.1 and +0.2, for each 1 ee 
5. Obtain additional values of y, corresponding : : (2) S +h Tua — h)* 
of the following problems, by use of power series, rounding the results to five decimal Yn 2V1 +R (Vi th +h + a(vi+h ) 
places: she + (-D(VT FB hh (VT FF )M, 
‘+y =0,y0) = 1. 
‘ y of aH = 2x3, y(0) = 0. where y; is the independently calculated approximation to the true value e~*, whereas 
(c) y ty +2y? = 0, yO) = 1. the exact solution is given by yn = e—™. 
(d) ay’ =1—y + 2%’, y(0) = 1. = d 15. Show that, when / is small, the solutions obtained in Prob. 14 can be expressed 
[The respective analytical solutions are e~*, e*? — 1 + 2%, (2e7 — 1 — =)", om in the forms r 
yD = (eo — poh? + - +.) 
t t. ; h 
. a is (d) Proceed as in Prob. 5 by Picard’s meres and 
i . 4.14). 
P to (d) Proceed as in Prob. 5 by use of Eqs. (6.4. ‘ 
. eae aoe starting values of y when x = 0.1, 0.2, 0.3, and 0.4 for the y? = ((L — gh? + + + -) — gel — ght + + - le + GaP + 


following problem, by use of Eqs. (6.4.13), assuming that only the tabulated values of 


+ (-U"[ (poh? +.) + Hel — GRE + eh — GRIF 
$(x) are available, and rounding the results to five decimal places: 

















where e represents the error associated with the value employed for y1, and where 


y’ + 2y = (2), y(0) =1. omitted terms in each expansion are small, of order h*. 





x (2) x (2) 16. Suppose that a spacing h = 0.1 is used in the approximations of Prob. 14, and 
that the value of y: used in the second calculation is assumed to be free of error. 

0.0 1.00000 0.6 1.16412 Calculate the errors and relative errors in the two approximations for values of n in 

ve nee oe apart the neighborhood of 10, 50, and 100, neglecting the effects of round-off errors. 

0.2 1.01980 0.8 1.27059 17. Show that, when Milne’s second-difference method is used, the parasitic part 

0.3 1.04399 0.9 1.32660 ln = u(2n) of the approximate solution of the problem 

0.4 1.07683 1.0 1.38177 . 

0.5 | 1.11730 Mudge aioe 





[Here ¢(x) = cos x + # sin x, and hence y(x) = e-**/? + sin 2,] where A is a constant, is approximated by 


Abs il. 

Section 6.5 iz un © (—1)" 60 ¥° the | endenrs 
i i f Prob. 5 to « = 1 wil 

. to (d) Advance the calculation of the solutions 0 ‘ 
h : aay oa the Adams method, rounding all calculated ordinates to five decim: 

laces and estimating the errors. 

- 10. (a) to (d) Proceed as in Prob. 9, using Eq. (6.2.18). ‘ a 

11. Advance the calculations of Prob. 8 to « = 1 with h = 0.1, (@) by use 0 t 
Adams method and (b) by use of (6.2.18), rounding all calculated ordinates to 


decimal places and estimating the errors. 


when |Ah| is small, where ¢ is the error inherent in 1. 


Section 6.8 


18. If the formula 
Yn4t = Yn + hla ayh 1 + aoy,) 
in used for the numerical solution of the problem 
Section 6.6 
12. (a) to (e) Recalculate the ordinates required in Probs. 9(a) to (d) and 11 by 
of the modified Adams method, again retaining five Places and estimating the erro’ 
18. (a) to (e) Proceed as in Prob. 12 by use of Milne’s method. 


y’ = F(a,y), y(Xo) = Yo, 


lw 1 2 0, ao 2 O, if |Fy(a,y)| S K throughout the calculation leading to y,, and if 
Khe. <1, show that the error e, in y, is bounded by the inequality 


El(i+Khoo\* _ 5]. | cna _ 
ral ec a i] ae 1), 


where 2 is the largest error introduced in a single step. Also specialize to the cases 
fw. = 0, ao = 1), (a1 = 1, ap = 0), and (a1 = ap = %), showing that E is given 
hy i'M, + R in the first two cases and by y'gh*M; + R in the third case, where M;, 
le the maximum value of |y(a)| for a S # S a, and where RF is the maximum 
yound-off error introduced in & single step, 


Section 6.7 
14, Show that the approximate solution of the problem 


y ty =9, y(0) = 1 


afforded by the result of retaining first differences in the formula (6.3.2) would be 
the form 


len| S 


vt? = (54) 
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19. Suppose that F,(z,y) is known to be negative throughout the calculation con- 
sidered in Prob. 18, and also that 


0 <o S —F,(z,y) 
where w is a constant. Show that then e, is dominated by en, where 
(1 + hwa_i)eng1 = (1 — hwao)en + E (n = 0,1,2,.. .) 


and eo = 0, if hwao < 1, and deduce the more useful bound 


E = fae") mw Fl ene 
lel [1 - 1 + hwa_s a e ) 





in this case. ; ; 
20. Suppose that the formula of the Adams method, written in the form 


¥ 
Yn4i = Yn + ho-ay), 41 + hey, + > ORY ey 
k=1 
is applied to the problem 
y’ =Fz,y), —-y(@o) = Yo 
where it is known that 
0<w S —F,(z,y) SK, ' 


where w and K are constants, and assume also that ao 2 0, a = 0, and hway < 1. 
Show that, if the maximum error introduced in a single step is HZ, then the error é, in 


yn 18 dominated by en, where 


r 
(1 + hoa ieny = (1 — hwaojen + hK laxlen_z + E, 
k=1 


if |e.| S ex fork =0,1,2,...,7. Show that one solution of this equation is of th 
form 
en = o — cBo, 
¥. 
where oe eer ae ) lal 
k=1 


that it is possible to take 8 such that 0 < Bo < 1if 6 > 0, and that then 
E n 
len] S @B0" + 55 1 — 80), 


where @ is the absolute value of the largest of the errors ¢0, «1, . vy 
21. Show that the absolute values of the errors in the calculations of Prob, 14 


dominated by “ ‘ 
ek 2—h "| sa a a8 f) ie oer 
+H - GH Bale es aaah dia] 


and 
(F +%) (VTP m+ hy — 1) + lel T+? + 
~ (+P) m= 0 +hd 
respectively, where $R$ is the magnitude of the maximum round-off error introduced
in a single step and $\epsilon$ is the inherent error in $y_1$, and compare these bounds with the
direct error calculations of Prob. 16. (Use the results of Prob. 19 in the first case.)

22, 23. (a) to (e) Calculate values of $F_y$ at appropriate stages of the calculations of
Probs. 12 and 13, and obtain corresponding approximate error bounds, considering
separately the effects of truncation and round-off errors and using the result of Prob. 20
when it is appropriate. Estimate the truncation error in each step by approximating
$h^n y^{(n)}$ by $h \nabla^{n-1} y'$, in the appropriate error term.


Section 6.9 


24. If $C$ is the numerical factor by which $-\gamma_{n+1}$ should be divided to give an approximation
to the truncation error involved in a calculation of $y_{n+1}$, based on a pair of
formulas of open and closed types, when the effect of the term $h\alpha_{-1}F_y(x_{n+1},\eta_{n+1})$ is
neglected (see §6.6), show that the true factor is generally more closely approximated
by $C - (C - 1)\rho_{n+1}$, where $\rho_{n+1}$ is the convergence factor in the iteration for $y_{n+1}$.

25, 26. (a) to (e) Use the data of Probs. 22 and 23 to estimate the convergence 
factors at relevant stages of the corresponding calculations, and determine whether 
the use of the result of Prob. 24 appreciably modifies the previously calculated indi- 
vidual truncation-error and over-all-error bounds. 

27. Use any numerical step-by-step method for the calculation of approximate 
values of the solution of each of the following problems for x = 0.0(0.1)1.0, with an 
error which can be reasonably confidently expected to be less than one unit in the 
fifth decimal place: 

(a) y' =x —y*, y(0) =1. 

(b) y’ =a +sin y, y(0) = 7/2. 

(c) y’ = e*4, y(0) = 1. 


Section 6.10 


28. Suppose that closed formulas of the type (6.10.3) and (6.10.4) are used for the 
merical integration of (6.10.1), that the relevant truncation errors in the nth step 
are By and Ej, respectively, and that another pair of formulas of open type is used for 
prediction of yny1 and un41, with truncation errors E, and E\, respectively. If the 
predicted values are denoted by y{°), and u{%,, the finally calculated values by yn41 
ind Uny1, and the true values by Yn4y1 and Un41, and if the notation 


EF. — FE, E, — Ej 


=C = 


in introduced, show that there follows 
Ynst — Yer = —(C — 1)E2, 
Ynys — Yasui = ha(Un41 = Un+41) + E,, 
Uno — uty = —(C — 1s, 
Uns — Unt = has[Gyn4,(Vngs = Yn) + Gy'n4,(Un41 = Un4i)] + E,, 
whore G,,,, and Gy,,, are appropriate values of G, and G,, respectively. By elimi- 


hating Hy and E;, express Tayi = Ynsi — yogi and Th; = UOna1 — Ung as linear 
' . , . 
Hombinations of yay. ™ Yay: — yO, and Yn41 = Un4s — ul?,, and show that, if 


hla al [Alona Gru + IGyaull « 1, 


#0 that also the convergence factor py): is such that lonyi| & 1, and if it is assumed that 
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C’ = C > 1, then the approximations 


1 
Tri i= C Emel + hee1¥n41)s rs C (Ynys + ha_y ‘Una Yn+1) 
generally provide better estimates than the simpler approximations Tnyi © —7Yn4i/C 
and T),., ~ —7n4,/C. In particular, obtain the estimates 


1 1.7 , 1 ’ 1 
Toor © —elyngt + gorngs Tag © TEC + Fo ym41) 


for the calculations of the illustrative example of §6.10. 

29. Obtain approximate values of the solution of each of the following problems for 
x = 0.0(0.1)1.0, determining appropriate starting values by power-series methods or 
otherwise, and proceeding by use of the modified Adams method, retaining only first
differences, estimating the errors introduced in each step, and retaining an appropriate 
number of decimal places in the calculations: 


(a) y”’ —y =0, y(0) = 1, y’(0) = — 

(b) $y'' + 2y' + 2y = 0$, $y(0) = 1$, $y'(0) = -1$.
(c) $xy'' + y' + xy = 0$, $y(0) = 1$, $y'(0) = 0$.
d) yy" +y +y? =2,y0) =1, y/0) = 0. 


(e) wu =x2t+u—v*%,u(0) = 
e | iia gtige v(0) = 
30. (a) to (e) Repeat the calculations of Prob. 29, retaining differences through the 
third. 
31. (a) to (e) Use the Milne second-difference procedure in place of the modified
Adams method in Prob. 29. 


Section 6.11 


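32. Show that, if the formulas
$$y_{n+1} = y_n + h(\alpha_{-1}y_{n+1}' + \alpha_0 y_n'), \qquad u_{n+1} = u_n + h(\alpha_{-1}u_{n+1}' + \alpha_0 u_n')$$
are used for the numerical solution of (6.10.2), with $\alpha_{-1} \ge 0$ and $\alpha_0 \ge 0$, then the
errors $\epsilon_n$ and $\epsilon_n'$ are dominated by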


KE’ —LE 
en = AB + AB — 4a” 
i Bo = 1 n Bi -—1 Pia Sy 
as h(a_1B0 + ao) AdBi + h(a@1B1 + ao) AB; h 


respectively, where 


_ 2+ hL(ao — a1) + 2h*Kayoa + VL? + 4Kh 
Z 2(1 — hLa_; — h?Ka?,) 





OW 


thGt VPFFIR ++ 


(f hLo_1 + h?Ka*, <1, where K and L are upper bounds on |G,| and |@,’|, when A 
and A, are assigned any values which ensure that e, and e. are nonnegative forn & 
33. Show that, if the hypothesis $|G_{y'}| \le L$ is replaced by the hypothesis
$$0 < \omega \le -G_{y'}$$
in Prob. 32, and if $h^2K\alpha_{-1}^2 < 1 + h\omega\alpha_{-1}$, then $L$ can be replaced by $-\omega$ throughout
the relations obtained in that problem.

34. (a) to (e) Use results of Prob. 32 or 33 to obtain approximate error bounds for
the calculations of Prob. 29.

35, 36. (a) to (e) Obtain approximate error bounds for the calculations of Probs. 30
and 31.


Section 6.12 


37. Suppose that the formula (6.12.3) is used with no differences to generate an
approximation to $e^{-x}$ as the solution of the problem
$$y'' = y, \qquad y(0) = -y'(0) = 1,$$
with spacing $h$, and that the value used for $y_1$ is in error by $\epsilon$, so that $y_1 = e^{-h} - \epsilon$.
Show that, if all subsequent calculations were effected without round-off, then the nth
calculated ordinate would be given exactly by


— f2 
Yn = y (tn) = 5" h (AG i. en + c) en” log (1+h) 


iz 





(a ic 7.52 er + :) e” log [1/(1-h)], 


where $x_n = nh$, and that the approximation
h € = h € P 
ved (1tgty en —G+a)e 


2, h € ] ~ ew7r h = 
= ¢ » = (G +§) sinh 2 =e -(G+g)e 


would hold when $h$ is small and $n$ large, so that the relative error in $y_n$ would then
be approximated by $(h^2 + 2\epsilon)e^{2x_n}/(4h)$. Show also that the corresponding relative
error in the approximation to $e^{x}$, with the modified condition $y'(0) = +1$, would be
approximated by the constant $(h^2 + 2\epsilon)/(4h)$ when $h$ is small and $n$ large, again
neglecting round-offs.

38. Obtain approximate values of the solution of each of the following problems for
x = 0.0(0.1)1.0, using the Milne procedure (6.12.5) and (6.12.6), and estimating the
error introduced in each step: 

(a) $y'' - y = 0$, $y(0) = 1$, $y'(0) = -1$.

(b) $y'' + xy = 0$, $y(0) = 0$, $y'(0) = 1$.

(c) y” + ay + gay? = 0, (0) = 0, y'(0) = 1. 

(d) y” +sin y = x, yO) = 1/2, y'(0) = 0. 

39. (a) to (d) Obtain approximate error bounds for the calculations of Prob. 38. 

40. Obtain approximate values of the solution of each of the following problems 
at the points noted, using the Milne procedure (6.12.5) and (6.12.6) after introducing 
the transformation (6.12.23), and estimating the error introduced in each step:

(a) «¥"’ + Y’+a2Y =0, Y(1) = 0.76520, Y’(1) = —0.44005: [x = 1.0(0.1)2.0]. 

(b) Y" +2Y’ + 22Y =0, Y(0) = 0, Y’(0) = 1: [x = 0.0(0.1)1.0]. 

(c) Y’ +2Y’ + 2Y2 =0, Y(0) = 1, Y’(0) = -1: [x = 0.0(0.1)1.0].

41. Show that the equation $y'' + f(x)y = 0$ is satisfied by
$$y(x) = A(x)\cos\theta(x),$$
where
$$\theta(x) = \int_{x_0}^{x} \nu(x)\,dx + \omega,$$
if $A$ and $\nu$ satisfy the equations $A'' - A\nu^2 + fA = 0$ and $2A'\nu + A\nu' = 0$, or hence if
$$A'' + fA = \frac{c^2}{A^3}, \qquad \nu = \frac{c}{A^2},$$
where $c$ and $\omega$ are arbitrary constants. Show also that the conditions
$$A(x_0) = A_0, \qquad A'(x_0) = 0, \qquad A''(x_0) = 0,$$
















which tend to require that $A(x)$ remain constant near $x = x_0$, are consistent with the
conditions $y(x_0) = y_0$ and $y'(x_0) = y_0'$ if $A_0$ and $\omega$ satisfy the relations
$$A_0\cos\omega = y_0, \qquad A_0\sin\omega = -\frac{y_0'}{\sqrt{f_0}},$$
and if $c = f_0^{1/2}A_0^2$, under the assumption that $f_0 = f(x_0) > 0$. [This procedure, attributed to Madelung,
is often useful when $f(x)$ is large and positive, so that $y(x)$ is strongly oscillatory, since
$A(x)$ often varies much less rapidly. A similar transformation, which is often useful
when $f(x)$ is large and negative, and $y(x)$ increases or decreases rapidly, may be obtained
analogously by replacing $\cos\theta$ by $\cosh\theta$, $\sinh\theta$, or $e^{\pm\theta}$ in the expression assumed for $y$,
according as the ratio of $|y_0|$ to $|y_0'|/(-f_0)^{1/2}$ is greater than, less than, or equal to unity,
respectively.]
42. Use the results of Prob. 41 to show that the solution of the problem
$$y'' + (16 - x^2)y = 0, \qquad y(0) = 1, \quad y'(0) = 0$$
can be expressed in the form
$$y(x) = A(x)\cos\theta(x),$$
where $A(x)$ is the solution of the problem
$$A'' + (16 - x^2)A = \frac{16}{A^3}, \qquad A(0) = 1, \quad A'(0) = 0,$$
and where
$$\theta(x) = 4\int_0^x \frac{dx}{A^2}.$$
Also determine $A(x)$, and hence $\theta(x)$ and $y(x)$, for x = 0.0(0.1)1.0 to five places, by a
numerical method.


Section 6.13 


43, 44. (a) to (e) Advance the calculations of Probs. 12 and 13 to x = 1.2 with
h = 0.05, given that φ(1.1) = 1.43392 and φ(1.2) = 1.48080 in Prob. 8.

45, 46, 47. (a) to (e) Advance the calculations of Probs. 29, 30, and 31 to x = 1.2
with h = 0.05.

48. (a) to (d) Advance the calculations of Prob. 38 to x = 1.2 with h = 0.05.


Section 6.14 
49(a) to (d), 50(a) to (e), 51(a) to (d). Obtain approximate solutions of Probs. 5, 29, 
and 38 for x = 0.0(0.1)1.0, by use of (6.14.2) and (6.14.3). 


Section 6.15 


52. (a) to (e) Obtain approximate values of the solutions of Probs. 5(a) to (d) and 
8 for x = 0.0(0.1)1.0 by use of (6.15.15) and (6.15.16), and estimate the errors.


Section 6.16 


53. (a) to (e) Obtain approximate values of the solutions of Probs. 5(a) to (d) and
for x = 0.0(0.1)1.0 by use of (6.16.5) and (6.16.6), and estimate the errors.

54(a) to (e), 55(a) to (d). Obtain approximate values of the solutions of Probs.
and 38 for x = 0.0(0.1)1.0 by use of (6.16.12) and (6.16.13), and estimate the errors.


Section 6.17


56. Use an appropriate step-by-step method to determine approximate five-place
values of $u(x)$ such that $u'' + u = 1$, $u(0) = 0$, $u'(0) = 0$, and of $v(x)$ such that
$v'' + v = 0$, $v(0) = 0$, $v'(0) = 1$, for x = 0.0(0.1)1.0. Then use these results to
determine approximate values of $y(x)$ for x = 0.0(0.1)1.0 such that $y''(x) + y(x) = 1$,
$y(0) = 0$, $y(1) = 1$, and compare the results with exact values.

57. Use an appropriate step-by-step method to determine approximate five-place 
values of the solution u(x) of the problem


u(t) + u(x) = 1, u(0) = 0, u'(0) =a, 


for x = 0.0(0.1)1.0, taking successively a = 0 and a = 1. Then use linear inter-
polation to estimate the value of a for which u(1) = 1, and investigate the correctness
of this estimate by making another corresponding step-by-step calculation (see 
Prob. 58). 

58. Prove that the procedure described in connection with Eqs. (6.17.11) and 
(6.17.12) would yield an exact result with linear interpolation on a if (6.17.11) were a 
linear equation and if no errors were committed in the determination of solutions 
corresponding to two trial values of a. 

59. Obtain approximate values of the solution of the problem
$$y'' + y = 0, \qquad y(0) = 0, \quad y(1) = 1$$
for x = 0.0(0.1)1.0 by use of (6.17.18) with h = 0.2.

60. Repeat the calculation of Prob. 59, using (6.17.17) with h = 0.2, and compare
the two approximations.

61. Repeat the calculation of Prob. 59, using (6.17.19) with h = 0.2, together with 
(6.17.20) and (6.17.21). 

62. Use the method of Prob. 59, together with (6.17.22), to deal with the modifica- 
tion of that problem in which the condition y(0) = 0 is replaced by the condition 
y'(0) = y(0).

63. Repeat the calculation of Prob. 62, using the method of Prob. 60, and compare 
the results with those obtained in Prob. 62. 


Section 6.18 


64. Determine approximate values of the smallest characteristic value of $\lambda$ for the
problem
$$y'' + \lambda y = 0, \qquad y(0) = y(1) = 0$$
by use of (6.18.3) and (6.18.4), and compare those approximations with the true value
$\pi^2$, and with corresponding approximations based on the use of (6.17.18) with N = 1
and 2.

65. Repeat the calculations of Prob. 64 when the condition $y(0) = 0$ is replaced by
the condition $y'(0) = 0$, making use of (6.17.22), in each case, in such a way that the
order of the procedure is not reduced, and compare the results with those which correspond to
the use of the approximate condition $\Delta y_0 = y_1 - y_0 = 0$.

66, 67. Repeat the calculations of Probs. 64 and 65, making use of (6.18.2) with 
N = 3. 

68 to 71. Deal as in Probs. 64 to 67 with the corresponding modified formulations
involving the equation $y'' + \lambda xy = 0$. [The true characteristic numbers in Probs.
68 and 70 are the zeros of the function $J_{1/3}(2\lambda^{1/2}/3)$, the smallest of which rounds to
18.956, whereas those in Probs. 69 and 71 are the zeros of the function $J_{-1/3}(2\lambda^{1/2}/3)$, the
smallest of which rounds to 7.8373.]











CHAPTER 7 

























LEAST-SQUARES POLYNOMIAL APPROXIMATION 


7.1. Introduction. There are two classes of situations in which the 
process of determining an approximation (polynomial or otherwise) to a 
function by fitting given data exactly at a certain set of discrete points 
often is a particularly inefficient one. 

First, when the function f(x), to be approximated, is specified for all 
values of x in an interval, it is clearly desirable to take many or all of the 
known values into account, rather than to select an arbitrary set, consist- 
ing of the least possible number of discrete values which leads to a deter- 
minate set of conditions. This is especially true when f(z) or one of its 
derivatives possesses known finite discontinuities or “jumps.”

Second, and on the opposite extreme, when only a discrete set of approx- 
imate values of f(x) is provided, and when the degree of reliability of 
those values is not well established, it is foolish (and, indeed, inherently 
dangerous) to attempt to determine a polynomial of high degree which 
fits the vagaries of such data exactly and hence, in all probability, is 
represented by a curve which oscillates violently about the curve which 
represents the true function. In particular, the use of the result for 
numerical differentiation would be hard to justify. 

The so-called method of least squares, which is designed for the treat-
ment of both these classes of problems, is introduced in the present
chapter, and its application to the analysis of typical situations is treated.
Several of the classical sets of orthogonal polynomials, which are partic-
ularly useful in these applications, are introduced, and certain of their
properties are discussed.

7.2. The Principle of Least Squares. In place of determining a poly-
nomial approximation $y(x)$, of degree $n$, to a certain function $f(x)$, by
requiring that the values of $y(x)$ at a set of $n + 1$ points agree with known
exact or approximate values of $f(x)$ at those points, as was done in preced-
ing chapters, it is often preferable to require that $y(x)$ and $f(x)$ agree as
well as possible (in some sense) over a domain $D$ of greater extent. This
domain may be taken as a continuous interval, when $f(x)$ is specified
analytically, or as a set, say, of $N + 1$ points, where $N > n$.

When the available data in $D$ are either exact or of equal reliability,
it is frequently assumed that the “best approximation” over $D$ is that
one for which the aggregate (sum or integral) of the squared error in $D$
is least. This postulate is often known as Legendre’s principle of least
squares. More generally, if $w(x_i)$ is a measure of the relative precision of
the value assigned to $f(x)$ when $x = x_i$, the criterion is modified by requir-
ing that the squared error at $x_i$ be multiplied by the weight $w(x_i)$ before
the aggregate is calculated.

Suppose first that exact values of $f(x)$ are known over a certain domain
$D$, which may consist of a discrete set of points $x_0, x_1, \ldots, x_N$ or of a
continuous interval $(a,b)$, and that the approximation is to be of the form
$$f(x) \approx \sum_{k=0}^{n} a_k\phi_k(x) = y(x), \tag{7.2.1}$$
where $\phi_0(x), \ldots, \phi_n(x)$ are $n + 1$ appropriately chosen functions. In
particular, in order to obtain a polynomial approximation of degree $n$
we could take $\phi_0 = 1$, $\phi_1 = x$, $\ldots$, $\phi_n = x^n$, although other choices of
the coordinate functions, which would also afford a basis for the genera-
tion of all polynomials of degree $n$, are often more convenient, as will be
seen. It is supposed that the specified weighting function $w(x)$ is non-
negative over $D$,
$$w(x) \ge 0. \tag{7.2.2}$$
If we define the residual $R(x)$ by the equation
$$R(x) = f(x) - \sum_{k=0}^{n} a_k\phi_k(x) = f(x) - y(x), \tag{7.2.3}$$
the best approximation (7.2.1), in the least-squares sense, is defined to be
that for which the $a$'s are determined so that the aggregate (sum or inte-
gral) of $w(x)R^2(x)$ over $D$ is as small as possible. It is convenient to
denote this aggregate here by $\{wR^2\}$. The requirement
$$\{wR^2\} = \Bigl\{w\Bigl[f - \sum_{k=0}^{n} a_k\phi_k\Bigr]^2\Bigr\} = \min \tag{7.2.4}$$
then imposes the conditions
$$\frac{\partial}{\partial a_r}\Bigl\{w\Bigl[f - \sum_{k=0}^{n} a_k\phi_k\Bigr]^2\Bigr\} = 0 \qquad (r = 0, 1, \ldots, n) \tag{7.2.5}$$
or
$$\Bigl\{w\phi_r\Bigl[f - \sum_{k=0}^{n} a_k\phi_k\Bigr]\Bigr\} = \{w\phi_r[f - y]\} = 0 \tag{7.2.6}$$
or
$$\sum_{k=0}^{n} a_k\{w\phi_r\phi_k\} = \{w\phi_rf\} \qquad (r = 0, 1, \ldots, n), \tag{7.2.7}$$
and hence leads to $n + 1$ simultaneous linear equations in the $n + 1$
unknown parameters $a_0, a_1, \ldots, a_n$. These equations are called the
normal equations of the process.
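
The assembly of the normal equations (7.2.7) over a discrete domain can be sketched in code. The following is a minimal Python illustration and is not part of the original text: the function names `normal_equations` and `solve`, the elimination routine, and the sample ordinates are all assumptions introduced for this sketch, and the aggregate is taken to be a weighted sum over the given points.

```python
def normal_equations(basis, xs, fs, ws=None):
    """Assemble c[r][k] = {w phi_r phi_k} and v[r] = {w phi_r f} as weighted sums."""
    if ws is None:
        ws = [1.0] * len(xs)          # unit weights unless told otherwise
    m = len(basis)
    c = [[0.0] * m for _ in range(m)]
    v = [0.0] * m
    for x, f, w in zip(xs, fs, ws):
        phi = [p(x) for p in basis]   # coordinate functions evaluated at x
        for r in range(m):
            v[r] += w * phi[r] * f
            for k in range(m):
                c[r][k] += w * phi[r] * phi[k]
    return c, v


def solve(c, v):
    """Solve c a = v by Gaussian elimination with partial pivoting (illustrative)."""
    n = len(v)
    a = [row[:] + [rhs] for row, rhs in zip(c, v)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for j in range(col, n + 1):
                a[r][j] -= factor * a[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = a[r][n] - sum(a[r][j] * x[j] for j in range(r + 1, n))
        x[r] = s / a[r][r]
    return x


# Hypothetical data: ordinates of 1 + 2x + 3x^2 at five points, fitted with 1, x, x^2.
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
fs = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
c, v = normal_equations(basis, xs, fs)
coeffs = solve(c, v)
```

Because the sample ordinates come from a quadratic and the coordinate functions span all quadratics, the computed coefficients should reproduce 1, 2, and 3 to within round-off, and the array `c` is symmetric, as the text goes on to note.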

It is useful to notice that these conditions can be expressed also in the 
form 


$$\{w(x)\phi_r(x)R(x)\} = 0 \qquad (r = 0, 1, \ldots, n). \tag{7.2.8}$$
Hence, since we have also
$$\{wR^2\} = \{wR \cdot R\} = \Bigl\{wR\Bigl[f - \sum_{k=0}^{n} a_k\phi_k\Bigr]\Bigr\} = \{wRf\} - \sum_{k=0}^{n} a_k\{w\phi_kR\}, \tag{7.2.9}$$
it follows that, when the coefficients $a_0, \ldots, a_n$ satisfy (7.2.7), the corre-
sponding aggregate squared residual reduces to
$$\{wR^2\}_{\min} = \{wRf\} = \{wf(f - y)\} = \{wf^2\} - \sum_{k=0}^{n} a_k\{w\phi_kf\}. \tag{7.2.10}$$


The smallness of this quantity can be used as a criterion for the efficiency 
of the approximation over D. 
In particular, if the domain $D$ consists only of $n + 1$ discrete points,
and if the set of functions $S_n$ generated by the coordinate functions $\phi_0$,
$\ldots$, $\phi_n$ comprises, say, all polynomials of degree not exceeding $n$, it is
possible to reduce $R(x)$ to zero at each point of the domain. Thus here
the least-squares procedure reduces to the determination of the polyno-
mial $y(x)$ of degree $n$ which agrees exactly with $f(x)$ at $n + 1$ points, and
the minimum value of $\{wR^2\}$ is zero. If the domain consists of $N + 1$
points, where $N > n$, or of a continuous interval, exact fit over all of $D$
is generally impossible and the procedure gives the function of the class
considered which affords the best approximate fit under the criterion
(7.2.4), in which the weighting function $w(x)$ must be specified.

It is seen from (7.2.7) that the coefficients of the unknowns in the
left-hand members of the normal equations are independent of the func-
tion $f(x)$ to be approximated, so that they may be precalculated, once
the coordinate functions and the weighting function have been selected.
Also, since $\{w\phi_i\phi_j\} = \{w\phi_j\phi_i\}$, the coefficient of $a_i$ in the jth equation is
equal to that of $a_j$ in the ith equation, so that the array of the coefficients
of the $a$'s is symmetrical with respect to its principal diagonal. This fact
appreciably reduces the labor in both the formation and the solution of
the set of equations (see §10.4).

Clearly, these equations are greatly simplified if the coordinate func-
tions are chosen, in advance, in such a way that
$$\{w\phi_i\phi_j\} = 0 \qquad (i \ne j). \tag{7.2.11}$$
A set of $\phi$'s having this property over $D$ is said to be an orthogonal set,
relative to the weighting function $w(x)$, over $D$. For such a set of coordi-
nate functions, the corresponding set of normal equations (7.2.7) becomes
“uncoupled” and takes the form
$$a_r\{w\phi_r^2\} = \{w\phi_rf\} \qquad (r = 0, 1, \ldots, n). \tag{7.2.12}$$
Since $w(x)$ is nonnegative, the coefficient of $a_r$ cannot vanish,† and we
obtain the result
$$a_r = \frac{\{w\phi_rf\}}{\{w\phi_r^2\}} \qquad (r = 0, 1, \ldots, n). \tag{7.2.13}$$
Further, reference to (7.2.10) and (7.2.12) shows that the corresponding
value of $\{wR^2\}$ can be expressed in the alternative form
$$\{wR^2\}_{\min} = \{wf^2\} - \sum_{k=0}^{n} a_k^2\{w\phi_k^2\} \tag{7.2.14}$$
in this case.


In theoretical work it is often convenient to suppose that the $\phi$'s have
also been normalized in such a way that $\{w\phi_r^2\} = 1$, so that (7.2.13) and
(7.2.14) are still further simplified. However, this normalization is 
rarely convenient in practice. 
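
The uncoupled relations (7.2.13) and (7.2.14) lend themselves to a short numerical check. The sketch below is only an illustration under stated assumptions: the names `aggregate` and `orthogonal_fit` are hypothetical, unit weights are used, the ordinates are the five sample values of the worked example in §7.3, and the pair 1 and x - 2 is chosen because it is orthogonal over the points x = 0, 1, 2, 3, 4.

```python
def aggregate(g, xs, ws):
    """Weighted sum {w g} over the discrete domain."""
    return sum(w * g(x) for x, w in zip(xs, ws))


def orthogonal_fit(basis, xs, fs, ws):
    """Coefficients a_r = {w phi_r f} / {w phi_r^2}, valid only for an orthogonal set."""
    coeffs = []
    for phi in basis:
        num = sum(w * phi(x) * f for x, f, w in zip(xs, fs, ws))
        den = aggregate(lambda x: phi(x) ** 2, xs, ws)
        coeffs.append(num / den)
    return coeffs


xs = [0, 1, 2, 3, 4]
fs = [1.00, 3.85, 6.50, 9.35, 12.05]
ws = [1.0] * len(xs)
basis = [lambda x: 1.0, lambda x: x - 2.0]   # orthogonal over these five points

# Orthogonality check (7.2.11): {w phi_0 phi_1} should vanish.
cross = aggregate(lambda x: basis[0](x) * basis[1](x), xs, ws)

a = orthogonal_fit(basis, xs, fs, ws)

# Minimum aggregate squared residual from (7.2.14) ...
min_from_formula = sum(w * f * f for f, w in zip(fs, ws)) - sum(
    ak * ak * aggregate(lambda x, p=p: p(x) ** 2, xs, ws) for ak, p in zip(a, basis)
)
# ... and directly from the residuals, for comparison.
min_direct = sum(
    w * (f - sum(ak * p(x) for ak, p in zip(a, basis))) ** 2
    for x, f, w in zip(xs, fs, ws)
)
```

Each coefficient is obtained by a single division, with no simultaneous equations to solve, and the two evaluations of the minimum aggregate agree to within round-off.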

The root-mean-square (RMS) error in the approximation over $D$,
relative to $w(x)$, is defined to be
$$\epsilon_{RMS} = (f - y)_{RMS} = \sqrt{\frac{\{wR^2\}}{\{w\}}}. \tag{7.2.15}$$
Here, in particular, when $w(x) = 1$ the quantity $\{1\}$ represents the
length of the interval in the continuous case and the number $(N + 1)$ of
points in $D$ in the discrete case.

In the discrete case, it frequently happens that the given data are
empirical and correspond accordingly to an “observed function” $\bar f(x)$,
and that the “true function” $f(x)$ is not known. Here we must replace
$f(x)$ by $\bar f(x)$ in the preceding developments, and we are in position to
calculate only $\bar\epsilon_{RMS} = (\bar f - y)_{RMS}$ over $D$. The subsequent estimation of
the desired quantity $\epsilon_{RMS}$ is considered in the following section.

† It is assumed that none of the coordinate functions vanishes everywhere in $D$.

7.3. Least-squares Approximation over Discrete Ranges. Before
exploiting the convenience afforded by the use of orthogonal functions
we here consider the application of the general least-squares method to
the case when the domain $D$ comprises a discrete set of points. The case
when $D$ is a continuous interval is treated in a completely analogous way.

In accordance with the results of the preceding section, if an approxima-
tion of the form
$$f(x) \approx \sum_{k=0}^{n} a_k\phi_k(x) \tag{7.3.1}$$


is to hold over a set of $N + 1$ points $x_0, x_1, \ldots, x_N$, where $N \ge n$, in
the sense that the aggregate weighted squared error is to be a minimum,
$$\sum_{i=0}^{N} w(x_i)\Bigl[f(x_i) - \sum_{k=0}^{n} a_k\phi_k(x_i)\Bigr]^2 = \min, \tag{7.3.2}$$
the set of $n + 1$ normal equations (7.2.7) becomes
$$a_0\sum_{i=0}^{N} w(x_i)\phi_r(x_i)\phi_0(x_i) + a_1\sum_{i=0}^{N} w(x_i)\phi_r(x_i)\phi_1(x_i) + \cdots + a_n\sum_{i=0}^{N} w(x_i)\phi_r(x_i)\phi_n(x_i) = \sum_{i=0}^{N} w(x_i)\phi_r(x_i)f(x_i) \qquad (r = 0, 1, \ldots, n). \tag{7.3.3}$$
i=0 


These equations can be obtained quite simply, by first writing down the
$N + 1$ equations which would require that (7.3.1) be an equality at the
$N + 1$ points $x_i$,
$$\begin{aligned}
a_0\phi_0(x_0) + a_1\phi_1(x_0) + \cdots + a_n\phi_n(x_0) &= f(x_0), \\
a_0\phi_0(x_1) + a_1\phi_1(x_1) + \cdots + a_n\phi_n(x_1) &= f(x_1), \\
\cdots \qquad & \\
a_0\phi_0(x_N) + a_1\phi_1(x_N) + \cdots + a_n\phi_n(x_N) &= f(x_N).
\end{aligned} \tag{7.3.4}$$
The rth normal equation is then obtained by multiplying each equation
by the coefficient of $a_r$ in that equation, and by the weight associated
with that equation, and summing the results. Unless there is a reason
for proceeding otherwise, the weights are generally taken to be equal and
are assigned the value unity.

When $N = n$, the problem reduces to that of satisfying the $n + 1$
equations (7.3.4) in $n + 1$ unknowns, and the normal equations
are equivalent to the original ones.

As a simple illustration, suppose that the problem is that of fitting
the equation of a straight line as well as possible (in the least-squares
sense) to the following data:

    x       0       1       2       3       4
    f(x)    1.00    3.85    6.50    9.35    12.05

In place of writing out the equations (7.3.4), corresponding to the sub-
stitution of these corresponding values into the equation
$$a_0 + a_1x = f(x), \tag{7.3.5}$$
we may merely write down the array of the coefficients of $a_0$ and $a_1$ and
the right-hand members (the “augmented matrix” of the system) in the
form
$$\begin{matrix}
1 & 0 & 1.00 \\
1 & 1 & 3.85 \\
1 & 2 & 6.50 \\
1 & 3 & 9.35 \\
1 & 4 & 12.05
\end{matrix} \tag{7.3.6}$$
Under the assumption that all the data are of equal significance, we take
all weights equal to unity. The first normal equation then corresponds
to the result of adding the elements of the respective columns of (7.3.6),
to give the array [5 10 32.75], and the second corresponds to the result
of multiplying the elements in each row by the element of that row which
lies in the second column, and adding the results, to give the array
[10 30 93.10], so that the normal equations are
$$\begin{aligned}
5a_0 + 10a_1 &= 32.75, \\
10a_0 + 30a_1 &= 93.10,
\end{aligned} \tag{7.3.7}$$
yielding the solution $a_0 = 1.03$, $a_1 = 2.76$, and hence determining the
linear approximation
$$f(x) \approx y(x) = 1.03 + 2.76x. \tag{7.3.8}$$


The values obtained from this approximation at the points x = 0, 1, 
2, 3, and 4 are 1.03, 3.79, 6.55, 9.31, and 12.07, respectively, and the sum 
of the squared errors is found to be 0.0090. Thus the RMS error for 
these five points is about 0.042. 
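
The arithmetic of this illustration can be retraced with a few lines of code. The sketch below is an assumption-laden aid rather than part of the original text: the function name `fit_line` is hypothetical, unit weights are assumed, and the two normal equations are solved by Cramer's rule simply because the system is 2 by 2.

```python
import math


def fit_line(xs, fs):
    """Least-squares straight line a0 + a1*x with unit weights, via the normal equations."""
    c00 = float(len(xs))                        # sum of 1*1
    c01 = float(sum(xs))                        # sum of 1*x
    c11 = float(sum(x * x for x in xs))         # sum of x*x
    v0 = sum(fs)                                # sum of 1*f
    v1 = sum(x * f for x, f in zip(xs, fs))     # sum of x*f
    det = c00 * c11 - c01 * c01
    a0 = (v0 * c11 - c01 * v1) / det
    a1 = (c00 * v1 - c01 * v0) / det
    return a0, a1, (c00, c01, c11, v0, v1)


xs = [0, 1, 2, 3, 4]
fs = [1.00, 3.85, 6.50, 9.35, 12.05]

a0, a1, sums = fit_line(xs, fs)
fitted = [a0 + a1 * x for x in xs]
squared_error = sum((f - y) ** 2 for f, y in zip(fs, fitted))
rms_error = math.sqrt(squared_error / len(xs))
```

The tuple `sums` holds the entries 5, 10, 30, 32.75, and 93.10 of the normal equations, and the remaining quantities agree with the values 1.03, 2.76, 0.0090, and 0.042 quoted in the text to the digits shown.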

The interpretation of this result must depend upon the context. If
the given values are considered to be exact values of a true function, then
the figure 0.042 represents the RMS deviation of the approximate func-
tion $y(x)$ from the true function over the five points for which information
is available. In the absence of any further information, this figure would
afford the only available estimate of the RMS error over the continuous
range $0 \le x \le 4$.

On the other hand, if the given ordinates are empirical, and hence
properly correspond to an “observed function” $\bar f(x)$, the figure 0.042
represents only the RMS error in approximating $\bar f(x)$ over the five relevant
points. Unless additional information is supplied, or additional assump-
tions are made, no conclusions can be drawn with respect to the RMS
value of the true error $f - y$.

However, if it is postulated that the true function is such that the
residuals at each of the $N + 1$ points can be reduced to zero, and that the
impossibility of achieving this end in the case at hand is due to the pres-
ence of independent random errors in the several observed values, then
it is possible to obtain a certain amount of additional information. It
is also frequently desirable to estimate the errors in the calculated
coefficients. For these purposes, we examine the general problem in greater
detail in the remainder of this section.
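We suppose that the right-hand members of (7.3.4) are values
of an observed function $\bar f(x)$, and that the calculated coefficients are
denoted by $\bar a_0, \ldots, \bar a_n$, so that those relations become
$$\sum_{k=0}^{n} \bar a_k\phi_k(x_i) \approx \bar f(x_i) \qquad (i = 0, 1, \ldots, N), \tag{7.3.9}$$
whereas the proper equations are
$$\sum_{k=0}^{n} a_k\phi_k(x_i) = f(x_i) = \bar f(x_i) + E(x_i), \tag{7.3.10}$$
where $E(x_i)$ is the error associated with the “observed value” $\bar f(x_i)$.
The normal equations (7.3.3), associated with (7.3.9), can then be written
in the form
$$\sum_{k=0}^{n} c_{rk}\bar a_k = \bar v_r \qquad (r = 0, 1, \ldots, n), \tag{7.3.11}$$
where
$$c_{rk} = c_{kr} = \sum_{i=0}^{N} w(x_i)\phi_r(x_i)\phi_k(x_i) \tag{7.3.12}$$
and
$$\bar v_r = \sum_{i=0}^{N} w(x_i)\phi_r(x_i)\bar f(x_i). \tag{7.3.13}$$
The corresponding approximation $\sum \bar a_k\phi_k(x)$ may be denoted by $\bar y(x)$,
and the residual $\bar f(x_i) - \bar y(x_i)$ by $\bar R(x_i)$.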
If we denote by $C_{rs}$ the cofactor of $c_{rs}$ in the coefficient array of (7.3.11),
$$\begin{matrix}
c_{00} & c_{01} & \cdots & c_{0n} \\
c_{10} & c_{11} & \cdots & c_{1n} \\
\cdots & \cdots & \cdots & \cdots \\
c_{n0} & c_{n1} & \cdots & c_{nn}
\end{matrix} \tag{7.3.14}$$
and define the “reduced cofactor” $\tilde C_{rs} = C_{rs}/D$, where $D$ is the deter-
minant of the array, noticing that $C_{rs} = C_{sr}$ because of the symmetry of
the array, the solution of the set (7.3.11) can be expressed in the form
(see §10.2)
$$\bar a_r = \sum_{k=0}^{n} \tilde C_{rk}\bar v_k \qquad (r = 0, 1, \ldots, n). \tag{7.3.15}$$
In order to express the $\bar a$'s directly in terms of the given ordinates, we
introduce (7.3.13) into (7.3.15), to obtain
$$\bar a_r = \sum_{k=0}^{n} \tilde C_{rk}\sum_{i=0}^{N} w(x_i)\phi_k(x_i)\bar f(x_i) = \sum_{i=0}^{N}\Bigl[\sum_{k=0}^{n} \tilde C_{rk}\phi_k(x_i)\Bigr]w(x_i)\bar f(x_i).$$


Thus, if we introduce the abbreviation
$$\Phi_r(x) = \sum_{k=0}^{n} \tilde C_{rk}\phi_k(x), \tag{7.3.16}$$
this relation takes the form
$$\bar a_r = \sum_{i=0}^{N} w(x_i)\Phi_r(x_i)\bar f(x_i). \tag{7.3.17}$$
Accordingly, if $a_r$ denotes the corresponding coefficient calculated from
the true ordinates, there follows also
$$a_r - \bar a_r = \sum_{i=0}^{N} w(x_i)\Phi_r(x_i)E(x_i). \tag{7.3.18}$$
This relation gives the difference between the coefficients actually
obtained and those which would have been obtained if no observational
errors (or round-off errors) were present.


If the assumption is made that $f(x)$ is actually a member of the set $S_n$
of all functions expressible as linear combinations of $\phi_0, \ldots, \phi_n$, then
$f(x)$ is truly specified by the constants $a_0, \ldots, a_n$, and the difference
$a_r - \bar a_r$ is the true error in the calculated value of the rth constant.
That is, we then have $f(x) - \bar y(x) = \sum (a_r - \bar a_r)\phi_r(x)$.

Generally, only bounds on the observational errors $E(x_i)$, or estimated
mean values of their squares over a set of observations, are available in
practice. In the latter case, the weights $w(x_0), \ldots, w(x_N)$ are fre-
quently so chosen that the mean values of $w(x_0)E^2(x_0), \ldots, w(x_N)E^2(x_N)$
are (approximately) equal. Under this assumption, we may first obtain
from (7.3.18) the relation
$$(a_r - \bar a_r)^2 = [w(x_0)\Phi_r^2(x_0)][w(x_0)E^2(x_0)] + \cdots + [w(x_N)\Phi_r^2(x_N)][w(x_N)E^2(x_N)] + \cdots, \tag{7.3.19}$$
where the omitted terms at the end involve products of the form
$E(x_i)E(x_j)$ where $i \ne j$. If both sides of this equation are averaged over
many observations, if the mean of the product $E(x_i)E(x_j)$ is assumed to
be zero when $i \ne j$, and if the common mean value of $w(x_0)E^2(x_0), \ldots,$
$w(x_N)E^2(x_N)$ is denoted by $(wE^2)_m$, there then follows
$$(a_r - \bar a_r)^2_m = (wE^2)_m\sum_{i=0}^{N} w(x_i)\Phi_r^2(x_i). \tag{7.3.20}$$


This result can be put into a more convenient form. For this purpose,
we notice first that if $\bar f(x_i)$ is identified with $\phi_s(x_i)$ in (7.3.9), where
$0 \le s \le n$, there follows $\bar a_r = \delta_{rs}$, where $\delta_{rs} = 1$ when $r = s$ and 0 other-
wise. Thus we deduce from (7.3.17) that
$$\sum_{i=0}^{N} w(x_i)\Phi_r(x_i)\phi_s(x_i) = \delta_{rs}, \tag{7.3.21}$$
so that reference to (7.3.16) gives
$$\sum_{i=0}^{N} w(x_i)\Phi_r^2(x_i) = \sum_{i=0}^{N} w(x_i)\Phi_r(x_i)\Bigl[\sum_{k=0}^{n} \tilde C_{rk}\phi_k(x_i)\Bigr] = \sum_{k=0}^{n}\Bigl[\sum_{i=0}^{N} w(x_i)\Phi_r(x_i)\phi_k(x_i)\Bigr]\tilde C_{rk} = \sum_{k=0}^{n} \delta_{rk}\tilde C_{rk} = \tilde C_{rr}. \tag{7.3.22}$$
Thus (7.3.20) can be written in the form
$$(a_r - \bar a_r)^2_m = \frac{C_{rr}}{D}(wE^2)_m, \tag{7.3.23}$$
where $C_{rr}$ is the cofactor of $c_{rr}$ in the array of (7.3.14), and $D$ is the deter-
minant of that array. Here the subscript $m$ again indicates the formation
of the mean over a set of observations.
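
For the straight-line illustration earlier in this section, the functions of (7.3.16) and the identity (7.3.22) can be checked numerically. The following sketch is illustrative only: the name `line_big_phi` is hypothetical, unit weights and the coordinate functions 1 and x are assumed, and the reduced cofactors are written out by hand for the 2 by 2 array rather than computed by a general routine.

```python
def line_big_phi(xs):
    """Return Phi_0, Phi_1 of (7.3.16) for the functions 1 and x with unit weights."""
    c00 = float(len(xs))
    c01 = float(sum(xs))
    c11 = float(sum(x * x for x in xs))
    det = c00 * c11 - c01 * c01
    # Reduced cofactors (cofactor divided by determinant) of the symmetric 2x2 array.
    r00, r01, r11 = c11 / det, -c01 / det, c00 / det

    def big_phi0(x):
        return r00 + r01 * x

    def big_phi1(x):
        return r01 + r11 * x

    return big_phi0, big_phi1, (r00, r11)


xs = [0, 1, 2, 3, 4]
fs = [1.00, 3.85, 6.50, 9.35, 12.05]
big_phi0, big_phi1, (r00, r11) = line_big_phi(xs)

# (7.3.17): each coefficient is a fixed linear combination of the given ordinates.
a0 = sum(big_phi0(x) * f for x, f in zip(xs, fs))
a1 = sum(big_phi1(x) * f for x, f in zip(xs, fs))

# (7.3.22): the sum of squares of Phi_r over the points equals the reduced cofactor.
sum_sq0 = sum(big_phi0(x) ** 2 for x in xs)
sum_sq1 = sum(big_phi1(x) ** 2 for x in xs)
```

Here the reduced cofactors are 30/50 and 5/50, so the two sums of squares should come out as 0.6 and 0.1, and the coefficients should again be 1.03 and 2.76 up to round-off.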

It may happen that no explicit information with respect to the errors
$E(x_i)$ is available, but that the assumption that $[w(x_i)E^2(x_i)]_m$ is independ-
ent of $i$ is to be retained, together with the assumption that the true func-
tion can be fitted exactly at the $N + 1$ points involved. In this case, it
is possible to obtain an estimate of $(wE^2)_m$ in terms of the calculated
residuals $\bar R(x_0), \ldots, \bar R(x_N)$, such that
$$\bar R(x_i) = \bar f(x_i) - \bar y(x_i) = \bar f(x_i) - \sum_{k=0}^{n} \bar a_k\phi_k(x_i) \qquad (i = 0, \ldots, N). \tag{7.3.24}$$
For this purpose, we notice first that, since $E(x_i) = f(x_i) - \bar f(x_i)$, we
have also
$$f(x_i) - \bar y(x_i) = E(x_i) + \bar R(x_i). \tag{7.3.25}$$
From (7.3.10) and (7.3.24), there follows
$$E(x_i) + \bar R(x_i) = \sum_{k=0}^{n} (a_k - \bar a_k)\phi_k(x_i) \tag{7.3.26}$$
or, after using (7.3.18),
$$E(x_i) + \bar R(x_i) = \sum_{k=0}^{n} \phi_k(x_i)\sum_{j=0}^{N} w(x_j)\Phi_k(x_j)E(x_j). \tag{7.3.27}$$
If we multiply both members of (7.3.26) by $w(x_i)\bar R(x_i)$ and sum over
$i$, making use of the fact that $\sum_i w(x_i)\phi_k(x_i)\bar R(x_i) = 0$, in accordance with
(7.2.8), there follows
$$\sum_{i=0}^{N} w(x_i)E(x_i)\bar R(x_i) = -\sum_{i=0}^{N} w(x_i)\bar R^2(x_i). \tag{7.3.28}$$
Also, by multiplying both members of (7.3.27) by $w(x_i)E(x_i)$, summing
over $i$, and making use of (7.3.28), there follows
$$\sum_{i=0}^{N} w(x_i)[f(x_i) - \bar y(x_i)]^2 = \sum_{i=0}^{N} w(x_i)E^2(x_i) - \sum_{i=0}^{N} w(x_i)\bar R^2(x_i) = \sum_{k=0}^{n}\sum_{i=0}^{N}\sum_{j=0}^{N} w(x_i)\phi_k(x_i)\Phi_k(x_j)w(x_j)E(x_i)E(x_j). \tag{7.3.29}$$
If we now average the equal members of (7.3.29) over a set of obser-
vations, and again assume that $[E(x_i)E(x_j)]_m = 0$ when $i \ne j$ and that
$[w(x_i)E^2(x_i)]_m = (wE^2)_m$ for $i = 0, \ldots, N$, only the terms for which
$j = i$ will remain in the right-hand member, and there follows
$$\sum_{i=0}^{N} \bigl[w(x_i)[f(x_i) - \bar y(x_i)]^2\bigr]_m = (N + 1)(wE^2)_m - \sum_{i=0}^{N} [w(x_i)\bar R^2(x_i)]_m = \sum_{k=0}^{n}\Bigl[\sum_{i=0}^{N} w(x_i)\phi_k(x_i)\Phi_k(x_i)\Bigr](wE^2)_m = (n + 1)(wE^2)_m, \tag{7.3.30}$$
since the sum in brackets in the second right-hand member is unity, by virtue of
(7.3.21). Thus there follows
$$(wE^2)_m = \frac{1}{N - n}\sum_{i=0}^{N} [w(x_i)\bar R^2(x_i)]_m. \tag{7.3.31}$$

Since only the residuals which correspond to the single set of given
ordinates are available (in general), the best known approximation to
the mean values of $\bar R^2(x_i)$ consists in the set of calculated values. Thus
we deduce the formula
$$(wE^2)_m \approx \frac{1}{N - n}\sum_{i=0}^{N} w(x_i)\bar R^2(x_i) = \frac{\{w\}}{N - n}\,\bar\epsilon^2_{RMS}, \tag{7.3.32}$$
where the last form follows from the notation of (7.2.15).

It is convenient to summarize the preceding results in the case when
$w(x) = 1$, which is of most common occurrence. Here $\bar\epsilon^2_{RMS}$ is the mean
value of the squares of the $N + 1$ residuals $\bar R(x_i) = \bar f(x_i) - \bar y(x_i)$, so that
$$\bar\epsilon_{RMS} = \sqrt{\frac{\sum \bar R^2(x_i)}{N + 1}} \tag{7.3.33}$$
measures the RMS deviation between the observed function and its cal-
culated approximation over the $N + 1$ points involved. Equation
(7.3.32) affords the estimate
$$E_{RMS} \approx \sqrt{\frac{N + 1}{N - n}}\,\bar\epsilon_{RMS} = \sqrt{\frac{\sum \bar R^2(x_i)}{N - n}} \tag{7.3.34}$$
for the RMS deviation between the true function and the observed func-
tion over those points, and the combination of (7.3.23) and (7.3.32) gives
$$(\delta a_r)_{RMS} \approx \sqrt{\frac{C_{rr}}{D}}\,E_{RMS} \approx \sqrt{\frac{C_{rr}}{D}}\sqrt{\frac{\sum \bar R^2(x_i)}{N - n}} \tag{7.3.35}$$
as an estimate of the RMS error in the rth calculated coefficient $\bar a_r$. In
each case, $N + 1$ denotes the number of points employed and $n + 1$ the
number of independent coordinate functions. Both (7.3.34) and (7.3.35)
are essentially based on the assumption that the true function can be
expressed in the form $\sum a_k\phi_k(x)$, for some choice of the $a$'s, and are to be
used in the more general case with a corresponding degree of caution.
The estimate (7.3.34) is properly meaningless when $N = n$, since then all
the data are needed to determine the approximation, and no data remain
for the estimation of the error.

If the given data in the preceding example are empirical, the figure
$\bar\epsilon_{RMS} = 0.042$ thus represents the RMS deviation between the observed
function and the “smoothed function” over the five relevant points.
Since also $C_{00} = 30$, $C_{11} = 5$, and $D = 50$ in (7.3.7), (7.3.35) gives
$$(\delta a_0)_{RMS} \approx 0.8E_{RMS}, \qquad (\delta a_1)_{RMS} \approx 0.3E_{RMS},$$
where $E_{RMS}$ is the RMS value of the observational errors, under the
assumption that the true function is linear. Also, if use is made of
(7.3.34), with $n = 1$ and $N = 4$, we obtain the estimate
$$E_{RMS} \approx \sqrt{5/3}\,(0.042) \approx 0.055,$$
under the same assumption. Accordingly, the RMS errors in $a_0$ and $a_1$
then may be estimated as about 0.044 and 0.016, respectively. On the
other hand, if an independent estimate of the RMS value of the observa-
tional errors were available, a comparison of that estimate with the
estimate obtained here would serve to indicate the validity of the assump-
tion that the true function is indeed linear.
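
The estimates (7.3.33) to (7.3.35) for the straight-line example can be reproduced as follows. This is a sketch under stated assumptions, not a definitive implementation: the function name `line_error_estimates` is hypothetical, unit weights are assumed, and the cofactors are read off directly from the 2 by 2 array. Unrounded factors are used, so the coefficient errors differ slightly from the figures in the text, which were obtained from the rounded factors 0.8 and 0.3.

```python
import math


def line_error_estimates(xs, fs):
    """RMS residual, estimated observational RMS error, and coefficient RMS errors."""
    n_points = len(xs)        # N + 1
    n_funcs = 2               # n + 1 coordinate functions: 1 and x
    c00 = float(n_points)
    c01 = float(sum(xs))
    c11 = float(sum(x * x for x in xs))
    v0 = sum(fs)
    v1 = sum(x * f for x, f in zip(xs, fs))
    det = c00 * c11 - c01 * c01
    a0 = (c11 * v0 - c01 * v1) / det
    a1 = (c00 * v1 - c01 * v0) / det
    sum_sq = sum((f - (a0 + a1 * x)) ** 2 for x, f in zip(xs, fs))
    eps_bar = math.sqrt(sum_sq / n_points)                 # (7.3.33)
    e_rms = math.sqrt(sum_sq / (n_points - n_funcs))       # (7.3.34), N - n in the denominator
    # For a 2x2 array the cofactor of c00 is c11, and the cofactor of c11 is c00.
    da0 = math.sqrt(c11 / det) * e_rms                     # (7.3.35) with r = 0
    da1 = math.sqrt(c00 / det) * e_rms                     # (7.3.35) with r = 1
    return eps_bar, e_rms, da0, da1


xs = [0, 1, 2, 3, 4]
fs = [1.00, 3.85, 6.50, 9.35, 12.05]
eps_bar, e_rms, da0, da1 = line_error_estimates(xs, fs)
```

With these data the residual RMS is about 0.042 and the estimated observational RMS error about 0.055, as in the text; the unrounded coefficient estimates come out near 0.042 and 0.017.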

The solution of the normal equations, and the evaluation of the relevant 
determinant and cofactors, can be conveniently effected by the use of 
procedures described in §§10.4 and 10.5. 

The same methods are used more generally in dealing with sets of 
linear equations, in which there are more equations than unknowns, 
whether or not they arise from a problem (7.3.1) in “curve fitting.” 
In general, the original set is inconsistent, and does not possess a solution. 
The normal equations then correspond to the result of minimizing the 
(weighted or unweighted) squared deviations between the right- and
left-hand members of those equations. If the squared deviation associated
with the $k$th equation is to be weighted by $w_k$, the same end result
can be obtained alternatively by multiplying both sides of that equation
by $\sqrt{w_k}$, and using a unit weight in forming the normal equations.
In this connection, it should be noticed, for example, that the equations
$x = 2.3$ and $5x = 11.5$ are not equivalent, if the right-hand members
are known only to be correct to the places given, since the first
assertion is equivalent to $2.25 < x < 2.35$ and the second to
$2.29 < x < 2.31$.
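The equivalence between weighting the $k$th squared deviation by $w_k$
and multiplying the $k$th equation by $\sqrt{w_k}$ can be checked on a
small example. The following Python sketch is illustrative only: the
four equations in two unknowns, their right-hand members, the weights,
and the function name `solve_normal_2` are all hypothetical choices, and
the normal equations are solved by Cramer's rule rather than by the
procedures of §§10.4 and 10.5.

```python
import math

def solve_normal_2(rows, rhs, weights):
    """Form and solve the weighted normal equations for two unknowns.

    rows    : list of (a, b) coefficient pairs, one per equation a*x + b*y = c
    rhs     : list of right-hand members c
    weights : list of weights applied to the squared deviations
    """
    s11 = sum(w * a * a for (a, b), w in zip(rows, weights))
    s12 = sum(w * a * b for (a, b), w in zip(rows, weights))
    s22 = sum(w * b * b for (a, b), w in zip(rows, weights))
    t1 = sum(w * a * c for (a, b), c, w in zip(rows, rhs, weights))
    t2 = sum(w * b * c for (a, b), c, w in zip(rows, rhs, weights))
    det = s11 * s22 - s12 * s12
    # Cramer's rule for the 2 x 2 symmetric system.
    return ((t1 * s22 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Hypothetical inconsistent set: four equations, two unknowns.
rows = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
rhs = [1.0, 2.1, 2.9, -1.2]
weights = [1.0, 4.0, 2.0, 0.5]

# (a) weights applied to the squared deviations
direct = solve_normal_2(rows, rhs, weights)

# (b) each equation multiplied through by sqrt(w_k), then unit weights
scaled_rows = [(math.sqrt(w) * a, math.sqrt(w) * b) for (a, b), w in zip(rows, weights)]
scaled_rhs = [math.sqrt(w) * c for c, w in zip(rhs, weights)]
scaled = solve_normal_2(scaled_rows, scaled_rhs, [1.0] * len(rows))

print(direct, scaled)
```

The two printed pairs should agree to rounding error. In the same way,
replacing $x = 2.3$ by $5x = 11.5$ amounts to assigning that equation a
weight of 25.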

In this more general case, the coefficients of the left-hand members of
the original equations, as well as the right-hand members, may be subject
to error. Here, if the normal equations are again represented by (7.3.9),
and if $\epsilon_{\mathrm{RMS}}$ represents the RMS error of each of the
right-hand members of the original equations, whereas
$\eta_{\mathrm{RMS}}$ denotes the RMS error of each coefficient in the
original set, the estimate (7.3.35) is to be replaced by

$$(\delta a_r)_{\mathrm{RMS}} \approx \sqrt{\frac{C_{rr}}{D}}\,\sqrt{\epsilon_{\mathrm{RMS}}^2 + (a_0^2 + a_1^2 + \cdots + a_n^2)\,\eta_{\mathrm{RMS}}^2}, \tag{7.3.36}$$

when $w(x) = 1$, under the assumption that all errors are small, random,
and independent, and that the RMS errors in the coefficients of the
original equations are all equal.

7.4. Orthogonal Polynomials. We consider next the case when a
least-squares approximation is to be effected over the interval $(a,b)$,
and we attempt first to construct a set of polynomials $\phi_0(x)$,
$\phi_1(x)$, . . . , $\phi_r(x)$, . . . such that each member is
orthogonal to all others in the set, over $(a,b)$, relative to a
specified weighting function $w(x)$ which is nonnegative over that
interval. It is convenient to ask that $\phi_r(x)$ be a polynomial of
degree $r$. The problem then will be solved, in particular, if we obtain
a polynomial $\phi_r(x)$ which is orthogonal over $(a,b)$ to all
polynomials of degree inferior to $r$.

Thus we require a polynomial $\phi_r(x)$, of degree $r$, such that

$$\int_a^b w(x)\,\phi_r(x)\,q_{r-1}(x)\,dx = 0, \tag{7.4.1}$$

where $w$ is specified and where $q_{r-1}$ is an arbitrary polynomial of
degree $r - 1$ or less. In order to express this requirement in a more
useful form, we integrate by parts $r$ times, making use of the fact that
$q_{r-1}^{(r)} = 0$. For this purpose, we first introduce the notation

$$w(x)\,\phi_r(x) = \frac{d^r U_r(x)}{dx^r} \equiv U_r^{(r)}(x), \tag{7.4.2}$$

so that (7.4.1) becomes

$$\int_a^b U_r^{(r)}(x)\,q_{r-1}(x)\,dx = 0$$

or, after $r$ integrations by parts,

$$\Bigl[U_r^{(r-1)} q_{r-1} - U_r^{(r-2)} q_{r-1}' + U_r^{(r-3)} q_{r-1}'' - \cdots + (-1)^{r-1}\,U_r\,q_{r-1}^{(r-1)}\Bigr]_a^b = 0. \tag{7.4.3}$$

The requirement that the function $\phi_r(x)$ defined by (7.4.2),

$$\phi_r(x) = \frac{1}{w(x)}\,\frac{d^r U_r(x)}{dx^r}, \tag{7.4.4}$$

be a polynomial of degree $r$ implies that $U_r(x)$ must satisfy the
differential equation

$$\frac{d^{r+1}}{dx^{r+1}}\left[\frac{1}{w(x)}\,\frac{d^r U_r(x)}{dx^r}\right] = 0 \tag{7.4.5}$$

in $(a,b)$, whereas the requirement that (7.4.3) be satisfied for any
values of $q_{r-1}(a)$, $q_{r-1}(b)$, $q_{r-1}'(a)$, $q_{r-1}'(b)$, and so
forth, leads to the $2r$ boundary conditions

$$U_r(a) = U_r'(a) = U_r''(a) = \cdots = U_r^{(r-1)}(a) = 0, \tag{7.4.6}$$

$$U_r(b) = U_r'(b) = U_r''(b) = \cdots = U_r^{(r-1)}(b) = 0. \tag{7.4.7}$$

Thus if, for each integer $r$, a solution of (7.4.5) which satisfies
(7.4.6) and (7.4.7) can be obtained, the $r$th member of the required set
of functions is given by (7.4.4). From the homogeneity of these
conditions, it follows that each such solution will contain an arbitrary
multiplicative constant. It is known (see Szego [220]) that the problem
thus formulated does indeed possess a solution, even when $a$ and/or $b$
is infinite, under the assumptions that $w(x) \ge 0$ in $(a,b)$ and that
$\int_a^b x^k w(x)\,dx$ exists for all nonnegative integral values of $k$.


In accordance with the results of §7.2, the coefficients in the
expression

$$y(x) = \sum_{r=0}^{n} a_r\,\phi_r(x) \tag{7.4.8}$$

are then determined by the requirement

$$\int_a^b w(x)\,[f(x) - y(x)]^2\,dx = \min \tag{7.4.9}$$

in the form

$$a_r = \frac{\int_a^b w f \phi_r\,dx}{\int_a^b w \phi_r^2\,dx} \equiv \frac{1}{\gamma_r}\int_a^b w f \phi_r\,dx, \tag{7.4.10}$$

where the numerator depends upon $f$, whereas the denominator $\gamma_r$
is independent of $f$ and can be calculated once and for all.


The calculation of $\gamma_r$ is facilitated by the following
considerations. If we write

$$\phi_r(x) = A_{0r} + A_{1r}x + \cdots + A_{rr}x^r, \tag{7.4.11}$$

so that $A_{kr}$ is the coefficient of $x^k$ in $\phi_r(x)$ and
$A_r \equiv A_{rr}$ is its leading coefficient, there follows

$$\gamma_r = \int_a^b w(x)\,\phi_r(x)\,\phi_r(x)\,dx = \int_a^b w(x)\,\phi_r(x)\,[A_{0r} + A_{1r}x + \cdots + A_r x^r]\,dx,$$

and hence, if we recall the relations

$$\int_a^b w(x)\,\phi_r(x)\,x^i\,dx = 0 \qquad (i = 0, 1, \ldots, r - 1), \tag{7.4.12}$$

which are equivalent to (7.4.1), we may deduce that

$$\gamma_r = A_r \int_a^b x^r w(x)\,\phi_r(x)\,dx = A_r \int_a^b x^r\,U_r^{(r)}(x)\,dx.$$

By integrating by parts $r$ times, and making use of (7.4.6) and (7.4.7),
this relation takes the convenient form

$$\gamma_r \equiv \int_a^b w(x)\,\phi_r^2(x)\,dx = (-1)^r\,r!\,A_r \int_a^b U_r(x)\,dx, \tag{7.4.13}$$

where $A_r$ is the coefficient of $x^r$ in $\phi_r(x)$.
Principally for later reference (§8.4), it is shown next that if $w(x)$
does not change sign in $(a,b)$, the polynomial $\phi_r(x)$ possesses $r$
distinct real zeros, all of which lie in the interval $(a,b)$. In order to
establish this fact, we notice first that, since
$\int_a^b w\,\phi_0\,\phi_r\,dx = A_0 \int_a^b w\,\phi_r\,dx = 0$ when
$r \ge 1$, and $w(x)$ is of constant sign, $\phi_r(x)$ must change sign at
least once in $(a,b)$ when $r \ge 1$. Now let those real zeros of
$\phi_r(x)$ which are of odd multiplicity, and which lie in $(a,b)$, be
denoted by $e_1, e_2, \ldots, e_m$ and assume that $m < r$. Then the
product

$$(x - e_1)(x - e_2) \cdots (x - e_m)\,\phi_r(x)$$

does not change sign in $(a,b)$. But, since $m < r$, the coefficient of
$\phi_r(x)$ is a polynomial of degree less than $r$, and hence, by
(7.4.1), we must have

$$\int_a^b w(x)\,[(x - e_1)(x - e_2) \cdots (x - e_m)]\,\phi_r(x)\,dx = 0.$$

However, since $w(x)$ does not change sign in $(a,b)$, the integrand
therefore has the same property, and a contradiction follows. Hence there
must follow $m = r$, and since the total multiplicity of all zeros is
equal to $r$, all roots must be real and distinct and must lie in $(a,b)$,
as was to be shown.

It is useful to notice that the problem specified by (7.4.8) and (7.4.9)
can be generalized in the following way. It may happen that $f(x)$ clearly
cannot be satisfactorily approximated over $(a,b)$ by a polynomial of low
degree, but that a certain function $v(x)$ is known such that the ratio
$f(x)/v(x)$ can be so approximated. Thus, if we determine the coefficients
of the relation

$$y(x) = v(x) \sum_{r=0}^{n} b_r\,\phi_r(x) \tag{7.4.14}$$

in such a way that

$$\int_a^b w(x)\left[\frac{f(x)}{v(x)} - \sum_{r=0}^{n} b_r\,\phi_r(x)\right]^2 dx = \min, \tag{7.4.15}$$

the orthogonality of the $\phi$'s relative to $w$ leads to the result

$$b_r = \frac{1}{\gamma_r}\int_a^b \frac{w}{v}\,f\,\phi_r\,dx. \tag{7.4.16}$$

It is seen that (7.4.15) is equivalent to the result of minimizing the
squared error $(f - y)^2$ with the weighting function $w/v^2$. The choice
$w(x) = v(x)$ is a frequently useful one.

Several examples of such approximations are considered in the following
sections.

7.5. Legendre Approximation. For least-squares approximation over
an interval of finite length, it is convenient to suppose that a linear
change in variables has transformed that interval into the interval
$(-1,1)$. We consider here the case when the weighting function is unity,

$$w(x) = 1. \tag{7.5.1}$$

The differential equation (7.4.5) then becomes

$$\frac{d^{2r+1} U_r}{dx^{2r+1}} = 0, \tag{7.5.2}$$

and the boundary conditions (7.4.6) and (7.4.7) take the form

$$U_r(\pm 1) = U_r'(\pm 1) = \cdots = U_r^{(r-1)}(\pm 1) = 0, \tag{7.5.3}$$

from which there follows (analytically or by inspection)

$$U_r = C_r\,(x^2 - 1)^r, \tag{7.5.4}$$


where $C_r$ is an arbitrary constant. Hence, from (7.4.4), it follows that
the $r$th relevant orthogonal polynomial is of the form

$$\phi_r(x) = C_r\,\frac{d^r}{dx^r}(x^2 - 1)^r. \tag{7.5.5}$$

With $C_r = 1/(2^r r!)$, the polynomial so obtained is known as the $r$th
Legendre polynomial and is usually denoted by $P_r(x)$,

$$P_r(x) = \frac{1}{2^r r!}\,\frac{d^r}{dx^r}(x^2 - 1)^r. \tag{7.5.6}$$

From the preceding derivation, it follows that

$$\int_{-1}^{1} P_r(x)\,P_s(x)\,dx = 0 \qquad (r \ne s), \tag{7.5.7}$$

where $r$ and $s$ are any nonnegative integers. The value assigned to
$C_r$ is such that $P_r(1) = 1$, and it is true also that
$|P_r(x)| \le 1$ when $|x| \le 1$. The first six of these polynomials may
be obtained in the forms

$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac{1}{2}(3x^2 - 1),$$
$$P_3(x) = \tfrac{1}{2}(5x^3 - 3x), \quad P_4(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3), \tag{7.5.8}$$
$$P_5(x) = \tfrac{1}{8}(63x^5 - 70x^3 + 15x),$$

and additional ones can be determined from the recurrence formula†

$$P_{r+1}(x) = \frac{2r + 1}{r + 1}\,x\,P_r(x) - \frac{r}{r + 1}\,P_{r-1}(x). \tag{7.5.9}$$

It may be noted that the polynomials of even and odd degrees are even
and odd functions of $x$, respectively.
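The Legendre recurrence formula and the orthogonality property over
$(-1,1)$ lend themselves to an exact check in rational arithmetic. The
following Python sketch is an illustration only; the coefficient-list
representation (constant term first) and the function names are
assumptions introduced here, not notation of the text.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def legendre(n_max):
    """Return [P_0, ..., P_n_max] built from
    P_{r+1} = ((2r+1)/(r+1)) x P_r - (r/(r+1)) P_{r-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for r in range(1, n_max):
        x_pr = [Fraction(0)] + polys[r]              # coefficients of x * P_r
        prev = polys[r - 1] + [Fraction(0)] * 2      # P_{r-1}, padded to the same length
        polys.append([Fraction(2 * r + 1, r + 1) * a - Fraction(r, r + 1) * b
                      for a, b in zip(x_pr, prev)])
    return polys[: n_max + 1]

def integrate_sym(p):
    """Exact integral of a polynomial over (-1, 1); odd powers contribute nothing."""
    return sum(Fraction(2, k + 1) * c for k, c in enumerate(p) if k % 2 == 0)

def inner(p, q):
    """Integral of p(x) q(x) over (-1, 1) with unit weighting function."""
    return integrate_sym(poly_mul(p, q))

P = legendre(5)
for r, p in enumerate(P):
    print(r, [str(c) for c in p], "integral of P_r^2 =", inner(p, p))
```

Printing `inner(P[r], P[s])` for `r != s` gives exact zeros, and the
printed integrals of $P_r^2$ can be compared with the factor $\gamma_r$
of (7.4.13) for this weighting function.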


† For a derivation of this formula, and of other similar formulas to be
listed without derivation in the remainder of this chapter, see Szego
[220].

In order to evaluate the factor (7.4.13), we notice first that (7.5.6)
gives

$$P_r(x) = \frac{1}{2^r r!}\,\frac{d^r}{dx^r}\bigl(x^{2r} - r\,x^{2r-2} + \cdots\bigr) = \frac{(2r)!}{2^r (r!)^2}\,x^r + \cdots,$$

so that

$$A_r = \frac{(2r)!}{2^r (r!)^2}.$$

Hence (7.4.13) gives

$$\gamma_r \equiv \int_{-1}^{1} P_r^2(x)\,dx = \frac{(2r)!}{2^r r!}\,\frac{1}{2^r r!}\int_{-1}^{1}(1 - x^2)^r\,dx = \frac{(2r)!}{2^{2r}(r!)^2}\,\frac{2^{2r+1}(r!)^2}{(2r + 1)!} = \frac{2}{2r + 1}. \tag{7.5.10}$$

Thus the $n$th-degree least-squares polynomial approximation to $f(x)$
over $(-1,1)$, relevant to a constant weighting function, is defined by

$$y(x) = \sum_{r=0}^{n} a_r\,P_r(x) \qquad (-1 \le x \le 1), \tag{7.5.11}$$

where

$$a_r = \frac{2r + 1}{2}\int_{-1}^{1} f(x)\,P_r(x)\,dx. \tag{7.5.12}$$

It has the property that, of all polynomials $y_n(x)$ of degree $n$ or
less, the integrated squared error

$$\int_{-1}^{1} [f(x) - y_n(x)]^2\,dx$$

is least when $y_n(x)$ is identified with the polynomial defined by
(7.5.11). In virtue of (7.2.14), that minimum error is given by

$$\int_{-1}^{1} f^2\,dx - \sum_{r=0}^{n} \frac{2a_r^2}{2r + 1}.$$

In accordance with (7.4.14) to (7.4.16), it follows also that the
least-squares approximation to $f(x)$ of the form

$$y(x) = v(x)\sum_{r=0}^{n} b_r\,P_r(x) \qquad (-1 \le x \le 1), \tag{7.5.13}$$

relevant to the weighting function $1/[v(x)]^2$, where $v(x)$ is a
specified function, is that for which

$$b_r = \frac{2r + 1}{2}\int_{-1}^{1} \frac{f(x)}{v(x)}\,P_r(x)\,dx. \tag{7.5.14}$$

7.6. Laguerre Approximation. For least-squares polynomial
approximation in a semi-infinite interval, it is convenient to first
transform that interval into the interval $(0,\infty)$ by a translation of
the origin. A frequently used approximation makes use of a weighting
function of the form

$$w(x) = e^{-\alpha x}, \tag{7.6.1}$$

where $\alpha$ is a positive constant, taken to be sufficiently large to
ensure the existence of the integral of the squared error over the
semi-infinite interval (when this is possible).

From the results of §7.4, the relevant orthogonal polynomials are such
that

$$\phi_r(x) = e^{\alpha x}\,\frac{d^r U_r}{dx^r}, \tag{7.6.2}$$

where

$$\frac{d^{r+1}}{dx^{r+1}}\left[e^{\alpha x}\,\frac{d^r U_r}{dx^r}\right] = 0 \tag{7.6.3}$$

and where

$$U_r(0) = U_r'(0) = \cdots = U_r^{(r-1)}(0) = 0 \tag{7.6.4}$$

and

$$U_r(\infty) = U_r'(\infty) = \cdots = U_r^{(r-1)}(\infty) = 0. \tag{7.6.5}$$

The general solution of (7.6.3) is readily found to be

$$U_r = e^{-\alpha x}(c_0 + c_1 x + \cdots + c_r x^r) + d_0 + d_1 x + \cdots + d_{r-1}x^{r-1},$$

where the $c$'s and $d$'s are arbitrary constants. The conditions (7.6.5)
require that all $d$'s vanish, and (7.6.4) gives
$c_0 = c_1 = \cdots = c_{r-1} = 0$, so that there follows

$$U_r(x) = C_r\,x^r e^{-\alpha x}, \tag{7.6.6}$$

and hence

$$\phi_r(x) = C_r\,e^{\alpha x}\,\frac{d^r}{dx^r}(x^r e^{-\alpha x}). \tag{7.6.7}$$

With $C_r = 1$ and $\alpha = 1$, this polynomial is called the $r$th
Laguerre polynomial, and is usually denoted by $L_r(x)$,

$$L_r(x) = e^{x}\,\frac{d^r}{dx^r}(x^r e^{-x}). \tag{7.6.8}$$

It follows that, again taking $C_r = 1$, the polynomial (7.6.7) can be
expressed in the form

$$\phi_r(x) = L_r(\alpha x), \tag{7.6.9}$$

and that we have the orthogonality property

$$\int_0^{\infty} e^{-\alpha x}\,L_r(\alpha x)\,L_s(\alpha x)\,dx = 0 \qquad (r \ne s), \tag{7.6.10}$$

when $r$ and $s$ are nonnegative integers.

The first six of the Laguerre polynomials can be obtained in the form

$$L_0(x) = 1, \quad L_1(x) = 1 - x, \quad L_2(x) = 2 - 4x + x^2,$$
$$L_3(x) = 6 - 18x + 9x^2 - x^3, \quad L_4(x) = 24 - 96x + 72x^2 - 16x^3 + x^4, \tag{7.6.11}$$
$$L_5(x) = 120 - 600x + 600x^2 - 200x^3 + 25x^4 - x^5,$$

and additional ones can be determined from the recurrence formula

$$L_{r+1}(x) = (1 + 2r - x)\,L_r(x) - r^2\,L_{r-1}(x). \tag{7.6.12}$$

The value assigned to $C_r$ is such that the coefficient of $x^r$ in
$L_r(\alpha x)$ is $(-\alpha)^r$. Hence, from (7.4.13), there follows

$$\gamma_r \equiv \int_0^{\infty} e^{-\alpha x}\,L_r^2(\alpha x)\,dx = \alpha^r\,r!\int_0^{\infty} x^r e^{-\alpha x}\,dx = \frac{1}{\alpha}\,(r!)^2. \tag{7.6.13}$$
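The Laguerre recurrence $L_{r+1}(x) = (1 + 2r - x)L_r(x) - r^2 L_{r-1}(x)$
and the orthogonality relative to $e^{-x}$ over $(0,\infty)$ can be
verified exactly, since $\int_0^\infty x^k e^{-x}\,dx = k!$. The following
Python sketch assumes $\alpha = 1$ and the normalization used in this
section (leading coefficient $(-1)^r$); the function names and the
coefficient-list representation are hypothetical.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def laguerre(n_max):
    """Return [L_0, ..., L_n_max] from L_{r+1} = (1 + 2r - x) L_r - r^2 L_{r-1}."""
    polys = [[Fraction(1)], [Fraction(1), Fraction(-1)]]
    for r in range(1, n_max):
        cur = polys[r]
        scaled = [Fraction(1 + 2 * r) * c for c in cur] + [Fraction(0)]  # (1 + 2r) L_r
        shifted = [Fraction(0)] + cur                                    # x L_r
        prev = polys[r - 1] + [Fraction(0)] * 2                          # padded L_{r-1}
        polys.append([a - b - r * r * c for a, b, c in zip(scaled, shifted, prev)])
    return polys[: n_max + 1]

def weighted_integral(p):
    """Exact integral of e^{-x} p(x) over (0, infinity), term by term."""
    return sum(c * factorial(k) for k, c in enumerate(p))

def inner(p, q):
    """Integral of e^{-x} p(x) q(x) over (0, infinity)."""
    return weighted_integral(poly_mul(p, q))

L = laguerre(5)
for r, p in enumerate(L):
    print(r, [int(c) for c in p], "weighted integral of L_r^2 =", inner(p, p))
```

With these conventions the printed integrals can be compared with
$(r!)^2$, which is the value of $\gamma_r$ obtained above when
$\alpha = 1$.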


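Thus the $n$th-degree least-squares polynomial approximation to $f(x)$
in $(0,\infty)$, relevant to the weighting function
$w(x) = e^{-\alpha x}$, is defined by

$$y(x) = \sum_{r=0}^{n} a_r\,L_r(\alpha x) \qquad (0 \le x < \infty), \tag{7.6.14}$$

where

$$a_r = \frac{\alpha}{(r!)^2}\int_0^{\infty} e^{-\alpha x} f(x)\,L_r(\alpha x)\,dx. \tag{7.6.15}$$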


It has the property that, of all polynomials $y_n(x)$ of degree $n$ or
less, the integrated weighted squared error

$$\int_0^{\infty} e^{-\alpha x}\,[f(x) - y_n(x)]^2\,dx$$

is least when $y_n(x)$ is identified with the right-hand member of
(7.6.14). In order for this integral to exist, it is generally necessary
that $|f(x)|$ grow less rapidly than $e^{\alpha x/2}$ as
$x \to \infty$.

Another type of approximation employing Laguerre polynomials is
obtained if we require the coefficients in the relation

$$y(x) = e^{-\alpha x}\sum_{r=0}^{n} b_r\,L_r(\alpha x) \qquad (0 \le x < \infty) \tag{7.6.16}$$

to be such that

$$\int_0^{\infty} e^{\alpha x}\,[f(x) - y(x)]^2\,dx \equiv \int_0^{\infty} e^{\alpha x}\left[f(x) - e^{-\alpha x}\sum_{k=0}^{n} b_k\,L_k(\alpha x)\right]^2 dx = \min. \tag{7.6.17}$$

This is a special case of the problem specified by (7.4.14) to (7.4.16)
in which $v(x) = w(x)$, and the coefficients are thus obtained in the form

$$b_r = \frac{\alpha}{(r!)^2}\int_0^{\infty} f(x)\,L_r(\alpha x)\,dx. \tag{7.6.18}$$


In order that the integrals in (7.6.17) exist, it is generally necessary
that $f(x)$ tend to zero more rapidly than $e^{-\alpha x/2}$ as
$x \to \infty$.

7.7. Hermite Approximation. Over the doubly infinite interval
$(-\infty < x < \infty)$, a frequently used weighting function is of the
form

$$w(x) = e^{-\alpha^2 x^2}. \tag{7.7.1}$$

In this case the relevant orthogonal polynomials are defined by

$$\phi_r(x) = e^{\alpha^2 x^2}\,\frac{d^r U_r}{dx^r}, \tag{7.7.2}$$

where $U_r$ satisfies the equation

$$\frac{d^{r+1}}{dx^{r+1}}\left[e^{\alpha^2 x^2}\,\frac{d^r U_r}{dx^r}\right] = 0 \tag{7.7.3}$$

and where $U_r$ and its first $r - 1$ derivatives are to tend to zero as
$x \to \pm\infty$.

Since the function

$$U_r(x) = C_r\,e^{-\alpha^2 x^2} \tag{7.7.4}$$

has the property that its $r$th derivative is the product of itself and a
polynomial of degree $r$, it satisfies these conditions, and there follows

$$\phi_r(x) = C_r\,e^{\alpha^2 x^2}\,\frac{d^r}{dx^r}\bigl(e^{-\alpha^2 x^2}\bigr). \tag{7.7.5}$$

The Hermite polynomial of degree $r$ is usually defined by taking
$C_r = (-1)^r$ and, in addition, either $\alpha^2 = 1$ or
$\alpha^2 = \tfrac{1}{2}$ in (7.7.5). Both definitions are used in the
literature. We adopt the former one and write

$$H_r(x) = (-1)^r\,e^{x^2}\,\frac{d^r}{dx^r}\bigl(e^{-x^2}\bigr), \tag{7.7.6}$$

so that, with the choice $C_r = (-\alpha)^{-r}$, (7.7.5) becomes†

$$\phi_r(x) = H_r(\alpha x) = (-\alpha)^{-r}\,e^{\alpha^2 x^2}\,\frac{d^r}{dx^r}\bigl(e^{-\alpha^2 x^2}\bigr). \tag{7.7.7}$$

Thus these polynomials possess the orthogonality property

$$\int_{-\infty}^{\infty} e^{-\alpha^2 x^2}\,H_r(\alpha x)\,H_s(\alpha x)\,dx = 0 \qquad (r \ne s), \tag{7.7.8}$$

when $r$ and $s$ are nonnegative integers. The first six of the
polynomials defined by (7.7.6) are obtained in the form

$$H_0(x) = 1, \quad H_1(x) = 2x, \quad H_2(x) = 4x^2 - 2,$$
$$H_3(x) = 8x^3 - 12x, \quad H_4(x) = 16x^4 - 48x^2 + 12, \tag{7.7.9}$$
$$H_5(x) = 32x^5 - 160x^3 + 120x,$$

and additional ones can be determined from the recurrence formula

$$H_{r+1}(x) = 2x\,H_r(x) - 2r\,H_{r-1}(x). \tag{7.7.10}$$

† With the definition
$H_r(x) = (-1)^r e^{x^2/2}\,d^r(e^{-x^2/2})/dx^r$, and the choice
$C_r = (-1)^r\,2^{-r/2}\,\alpha^{-r}$, there would follow
$\phi_r(x) = H_r(\sqrt{2}\,\alpha x)$.

With $A_r = (2\alpha)^r$ and $U_r = (-1/\alpha)^r \exp(-\alpha^2 x^2)$,
Eq. (7.4.13) gives

$$\gamma_r \equiv \int_{-\infty}^{\infty} e^{-\alpha^2 x^2}\,H_r^2(\alpha x)\,dx = 2^r\,r!\int_{-\infty}^{\infty} e^{-\alpha^2 x^2}\,dx = \frac{2^r\,r!\,\sqrt{\pi}}{\alpha}. \tag{7.7.11}$$

Thus the $n$th-degree least-squares polynomial approximation to $f(x)$
over $(-\infty,\infty)$, relevant to the weighting function
$w(x) = e^{-\alpha^2 x^2}$, is defined by


$$y(x) = \sum_{r=0}^{n} a_r\,H_r(\alpha x) \qquad (-\infty < x < \infty), \tag{7.7.12}$$

where

$$a_r = \frac{\alpha}{2^r\,r!\,\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-\alpha^2 x^2}\,H_r(\alpha x)\,f(x)\,dx. \tag{7.7.13}$$

It has the property that, of all polynomials $y_n(x)$ of degree $n$ or
less, the integrated weighted squared error

$$\int_{-\infty}^{\infty} e^{-\alpha^2 x^2}\,[f(x) - y_n(x)]^2\,dx$$

is least when $y_n(x)$ is identified with the right-hand member of
(7.7.12). It must be assumed that the behavior of $f(x)$ is such that this
integral exists.

It can be noticed that, since the weighting function
$e^{-\alpha^2 x^2}$ becomes small very rapidly as $x$ increases in
magnitude, the least-squares criterion here requires that the magnitude
of the deviation $f(x) - y(x)$ be small when $x$ is small, but tolerates
large values of that deviation when $x$ is large in magnitude. A similar
remark applies somewhat less strongly to the approximation of the
preceding section. Thus, such approximations should not be used unless
this situation is an acceptable one.
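The Hermite recurrence $H_{r+1}(x) = 2xH_r(x) - 2rH_{r-1}(x)$ and the
orthogonality relative to $e^{-x^2}$ can also be checked exactly, since
$\int_{-\infty}^{\infty} x^{2j} e^{-x^2}\,dx = \sqrt{\pi}\,(2j)!/(4^j j!)$
and the odd moments vanish. The following Python sketch assumes
$\alpha = 1$ and reports every integral in units of $\sqrt{\pi}$; the
function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def hermite(n_max):
    """Return [H_0, ..., H_n_max] from H_{r+1} = 2x H_r - 2r H_{r-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(2)]]
    for r in range(1, n_max):
        two_x_hr = [Fraction(0)] + [2 * c for c in polys[r]]   # 2x H_r
        prev = polys[r - 1] + [Fraction(0)] * 2                # padded H_{r-1}
        polys.append([a - 2 * r * b for a, b in zip(two_x_hr, prev)])
    return polys[: n_max + 1]

def gauss_moment(k):
    """Integral of x^k exp(-x^2) over the real line, divided by sqrt(pi)."""
    if k % 2:
        return Fraction(0)
    j = k // 2
    return Fraction(factorial(2 * j), 4 ** j * factorial(j))

def inner(p, q):
    """Integral of exp(-x^2) p(x) q(x) over the real line, divided by sqrt(pi)."""
    return sum(c * gauss_moment(k) for k, c in enumerate(poly_mul(p, q)))

H = hermite(5)
for r, p in enumerate(H):
    print(r, [int(c) for c in p], "gamma_r / sqrt(pi) =", inner(p, p))
```

The printed ratios can be compared with $2^r r!$, in agreement with
(7.7.11) when $\alpha = 1$.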

Another type of approximation, of particular importance in the theory
of statistics, is obtained if we require the coefficients in the relation

$$y(x) = e^{-\alpha^2 x^2}\sum_{r=0}^{n} b_r\,H_r(\alpha x) \qquad (-\infty < x < \infty) \tag{7.7.14}$$

to be such that

$$\int_{-\infty}^{\infty} e^{\alpha^2 x^2}\,[f(x) - y(x)]^2\,dx \equiv \int_{-\infty}^{\infty} e^{\alpha^2 x^2}\left[f(x) - e^{-\alpha^2 x^2}\sum_{k=0}^{n} b_k\,H_k(\alpha x)\right]^2 dx = \min. \tag{7.7.15}$$

The conditions governing the $b$'s are obtained directly, or by reference
to (7.4.14) to (7.4.16), with $v(x) = w(x)$, in the form

$$b_r = \frac{\alpha}{2^r\,r!\,\sqrt{\pi}}\int_{-\infty}^{\infty} f(x)\,H_r(\alpha x)\,dx, \tag{7.7.16}$$

assuming that the behavior of $f(x)$, for large values of $|x|$, is such
that the integrals involved exist. In particular, the approximation
(7.7.14) is often used in situations when $f(x)$ vanishes for all values
of $|x|$ which exceed a certain value.


If the $r$th moment of $f(x)$ is defined as

$$m_r = \int_{-\infty}^{\infty} x^r f(x)\,dx, \tag{7.7.17}$$

and use is made of the explicit forms of (7.7.9), we find that the leading
coefficients in (7.7.14) are expressible in the forms

$$b_0 = \frac{\alpha}{\sqrt{\pi}}\,m_0, \qquad b_1 = \frac{\alpha^2}{\sqrt{\pi}}\,m_1, \qquad b_2 = \frac{\alpha}{4\sqrt{\pi}}\,(2\alpha^2 m_2 - m_0), \tag{7.7.18}$$

and that the remaining $b$'s can be similarly expressed in terms of the
moments.

7.8. Chebyshev Approximation. In cases when errors near the ends
of an interval $(a,b)$ are of particular importance, a weighting function
which is of the form $1/\sqrt{(x - a)(b - x)}$ is often useful. It is
supposed again that a linear change in variables has transformed the
given interval into the interval $(-1,1)$, so that the weighting function
becomes

$$w(x) = \frac{1}{\sqrt{1 - x^2}}. \tag{7.8.1}$$

In order to obtain the relevant orthogonal polynomials in this case, it
is convenient to start with the basic condition (7.4.1), rather than with
its consequences. Thus we require a polynomial $\phi_r(x)$, of degree $r$
in $x$, such that

$$\int_{-1}^{1} \frac{\phi_r(x)\,q_{r-1}(x)}{\sqrt{1 - x^2}}\,dx = 0, \tag{7.8.2}$$

where $q_{r-1}(x)$ is an arbitrary polynomial of degree $r - 1$ or less in
$x$. If we introduce the change in variables

$$x = \cos\theta, \tag{7.8.3}$$

this requirement becomes

$$\int_0^{\pi} \phi_r(\cos\theta)\,q_{r-1}(\cos\theta)\,d\theta = 0. \tag{7.8.4}$$


Now, since $\cos k\theta$ is expressible as a polynomial of degree $k$ in
$\cos\theta$ and since, conversely, any polynomial of degree $k$ in
$\cos\theta$ can be expressed as a linear combination of 1,
$\cos\theta$, $\cos 2\theta$, . . . , $\cos k\theta$, it follows that
(7.8.4) will be satisfied if and only if

$$\int_0^{\pi} \phi_r(\cos\theta)\,\cos k\theta\,d\theta = 0 \qquad (k = 0, 1, \ldots, r - 1). \tag{7.8.5}$$

It is easily verified that the function

$$\phi_r(\cos\theta) = C_r \cos r\theta \tag{7.8.6}$$

has this property. Hence, returning to the variable $x$ by using (7.8.3),
we verify that the functions

$$\phi_r(x) = C_r \cos\,(r \cos^{-1} x) \tag{7.8.7}$$

are the required orthogonal polynomials. With $C_r = 1$, these
polynomials are known as Chebyshev polynomials,† often denoted by
$T_r(x)$, so that we may write

$$\phi_r(x) = T_r(x) = \cos\,(r \cos^{-1} x). \tag{7.8.8}$$

Thus, these polynomials possess the orthogonality property

$$\int_{-1}^{1} \frac{T_r(x)\,T_s(x)}{\sqrt{1 - x^2}}\,dx = 0 \qquad (r \ne s), \tag{7.8.9}$$

when $r$ and $s$ are nonnegative integers. The first six of these
polynomials are obtained in the form

$$T_0(x) = 1, \quad T_1(x) = x, \quad T_2(x) = 2x^2 - 1,$$
$$T_3(x) = 4x^3 - 3x, \quad T_4(x) = 8x^4 - 8x^2 + 1, \tag{7.8.10}$$
$$T_5(x) = 16x^5 - 20x^3 + 5x,$$

and additional ones may be determined from the recurrence formula

$$T_{r+1}(x) = 2x\,T_r(x) - T_{r-1}(x). \tag{7.8.11}$$
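The agreement between the recurrence $T_{r+1}(x) = 2xT_r(x) - T_{r-1}(x)$
and the closed form $T_r(x) = \cos(r\cos^{-1}x)$, and the orthogonality
relative to $1/\sqrt{1 - x^2}$, can be checked numerically. The following
Python sketch is illustrative only: the function names are hypothetical,
and the weighted integral is evaluated by a midpoint rule in $\theta$
after the substitution $x = \cos\theta$, which is an assumption of this
sketch rather than a procedure of the text.

```python
import math

def chebyshev_values(n_max, x):
    """Return [T_0(x), ..., T_n_max(x)] computed by the three-term recurrence."""
    vals = [1.0, x]
    for r in range(1, n_max):
        vals.append(2.0 * x * vals[r] - vals[r - 1])
    return vals[: n_max + 1]

def weighted_inner(r, s, n_nodes=64):
    """Approximate the integral of T_r(x) T_s(x) / sqrt(1 - x^2) over (-1, 1).

    With x = cos(theta) the integrand becomes cos(r theta) cos(s theta) on
    (0, pi); the midpoint rule with n_nodes points integrates it without
    truncation error as long as r + s < 2 * n_nodes.
    """
    h = math.pi / n_nodes
    total = 0.0
    for j in range(n_nodes):
        x = math.cos((j + 0.5) * h)
        t = chebyshev_values(max(r, s), x)
        total += t[r] * t[s]
    return h * total

# Compare the recurrence with the closed form at a few sample points.
for x in (-1.0, -0.6, 0.3, 1.0):
    by_recurrence = chebyshev_values(5, x)
    by_cosine = [math.cos(r * math.acos(x)) for r in range(6)]
    print(x, max(abs(a - b) for a, b in zip(by_recurrence, by_cosine)))

print(weighted_inner(2, 3), weighted_inner(0, 0), weighted_inner(3, 3))
```

The first loop prints differences at the level of rounding error, and
the last line prints one off-diagonal and two diagonal values of the
weighted integral.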
In order to evaluate the factor

$$\gamma_r = \int_{-1}^{1} \frac{T_r^2(x)}{\sqrt{1 - x^2}}\,dx, \tag{7.8.12}$$

we again write $x = \cos\theta$ and $T_r(x) = \cos r\theta$, so that
there follows directly

$$\gamma_r = \int_0^{\pi} \cos^2 r\theta\,d\theta = \begin{cases} \pi & (r = 0), \\ \pi/2 & (r \ne 0). \end{cases} \tag{7.8.13}$$

Thus the $n$th-degree least-squares polynomial approximation to $f(x)$
over $(-1,1)$, relevant to the weighting function
$w(x) = 1/\sqrt{1 - x^2}$, is defined by

$$y(x) = \sum_{r=0}^{n} a_r\,T_r(x) \qquad (-1 < x < 1), \tag{7.8.14}$$

where

$$a_0 = \frac{1}{\pi}\int_{-1}^{1} \frac{f(x)}{\sqrt{1 - x^2}}\,dx, \qquad a_r = \frac{2}{\pi}\int_{-1}^{1} \frac{f(x)\,T_r(x)}{\sqrt{1 - x^2}}\,dx \quad (r \ne 0). \tag{7.8.15}$$

It has the property that, of all polynomials of degree $n$ or less, the
integrated weighted squared error

$$\int_{-1}^{1} \frac{1}{\sqrt{1 - x^2}}\,[f(x) - y_n(x)]^2\,dx$$

is least when $y_n(x)$ is identified with the right-hand member of
(7.8.14).

† The name of Chebyshev (or Tschebycheff) is associated with various
polynomials in the literature (see also §§7.11 and 8.13).

On the other hand, if we wish to approximate $f(x)$ by the product of
$1/\sqrt{1 - x^2}$ and a polynomial, over $(-1,1)$, with the weighting
function $\sqrt{1 - x^2}$, we are to determine the coefficients in the
relation

$$y(x) = \frac{1}{\sqrt{1 - x^2}}\sum_{r=0}^{n} b_r\,T_r(x) \qquad (-1 < x < 1) \tag{7.8.16}$$

such that

$$\int_{-1}^{1} \sqrt{1 - x^2}\,[f(x) - y(x)]^2\,dx \equiv \int_{-1}^{1} \sqrt{1 - x^2}\left[f(x) - \frac{1}{\sqrt{1 - x^2}}\sum_{k=0}^{n} b_k\,T_k(x)\right]^2 dx = \min.$$

The conditions determining the $b$'s are obtained in the form

$$\int_{-1}^{1} T_r(x)\left[f(x) - \frac{1}{\sqrt{1 - x^2}}\sum_{k=0}^{n} b_k\,T_k(x)\right] dx = 0,$$

and the use of (7.8.9) and (7.8.13), or of (7.4.14) to (7.4.16), yields
the determination

$$b_0 = \frac{1}{\pi}\int_{-1}^{1} f(x)\,dx, \qquad b_r = \frac{2}{\pi}\int_{-1}^{1} f(x)\,T_r(x)\,dx \quad (r \ne 0). \tag{7.8.17}$$

A great variety of other types of least-squares polynomial approximation
can be formulated in terms of other weighting functions. In particular,
for the weighting function

$$w(x) = (1 - x)^{\alpha}(1 + x)^{\beta} \qquad (\alpha > -1,\ \beta > -1), \tag{7.8.18}$$

over $(-1,1)$, which reduces to the Legendre case when
$\alpha = \beta = 0$ and to the Chebyshev case when
$\alpha = \beta = -\tfrac{1}{2}$, the $r$th orthogonal polynomial is
readily found to be of the form

$$\phi_r(x) = C_r\,(1 - x)^{-\alpha}(1 + x)^{-\beta}\,\frac{d^r}{dx^r}\bigl[(1 - x)^{\alpha + r}(1 + x)^{\beta + r}\bigr], \tag{7.8.19}$$

which may be identified with the $r$th Jacobi polynomial when $C_r$ is
suitably specified (see §8.9).

In particular, the factor $C_r$ for $T_r(x)$ is given by
$(-2)^r r!/(2r)!$, so that (7.8.8) can also be written in the form

$$T_r(x) = \frac{(-2)^r\,r!}{(2r)!}\,(1 - x^2)^{1/2}\,\frac{d^r}{dx^r}(1 - x^2)^{r - 1/2}. \tag{7.8.20}$$

Analogous polynomials $S_r(x)$, which are associated with the weighting
function $w(x) = (1 - x^2)^{1/2}$, and which can be expressed in the form

$$S_r(x) = \frac{\sin\,[(r + 1)\cos^{-1} x]}{\sin\,(\cos^{-1} x)} = \frac{(-2)^r\,(r + 1)!}{(2r + 1)!}\,(1 - x^2)^{-1/2}\,\frac{d^r}{dx^r}(1 - x^2)^{r + 1/2}, \tag{7.8.21}$$

are considered in Prob. 31.

For the weighting function

$$w(x) = x^{\beta} e^{-\alpha x} \qquad (\beta > -1,\ \alpha > 0), \tag{7.8.22}$$

over $(0,\infty)$, there follows

$$\phi_r(x) = C_r\,x^{-\beta} e^{\alpha x}\,\frac{d^r}{dx^r}\bigl(x^{\beta + r} e^{-\alpha x}\bigr), \tag{7.8.23}$$

and the resultant polynomials are frequently called Sonine polynomials
or generalized Laguerre polynomials (for additional information see Szego
[220]).

Before proceeding to a corresponding treatment of the case when
discrete data are involved, it is desirable to establish certain
analogies between integration and summation, and to obtain certain
special properties of the binomial coefficient functions and related
functions, which play the same roles in summation and differencing as do
the functions $1, x, \ldots, x^n$ in integration and differentiation.

7.9. Factorial Power Functions and Summation Formulas. The product
$s(s - 1) \cdots (s - n + 1)$, where $n$ is a positive integer, is often
called the factorial $n$th power of $s$, and the notation

$$s^{(n)} = s(s - 1) \cdots (s - n + 1) \tag{7.9.1}$$

is frequently used. It is related to the binomial coefficient function by
the equation

$$\binom{s}{n} = \frac{s^{(n)}}{n!}. \tag{7.9.2}$$

In the more general case when $n$ need not be a positive integer, (7.9.1)
is generalized by the definition

$$s^{(n)} = \frac{\Gamma(s + 1)}{\Gamma(s - n + 1)}, \tag{7.9.3}$$

in accordance with which there follows, in particular,

$$s^{(0)} = 1 \tag{7.9.4}$$

and

$$s^{(n)} = 0 \quad \text{when } n - s \text{ is a positive integer}. \tag{7.9.5}$$
In order to establish the usefulness of the notation (7.9.1) or (7.9.3)
we notice that, from (7.9.3), there follows

$$\Delta s^{(n)} = \frac{\Gamma(s + 2)}{\Gamma(s - n + 2)} - \frac{\Gamma(s + 1)}{\Gamma(s - n + 1)} = \left(\frac{s + 1}{s - n + 1} - 1\right)\frac{\Gamma(s + 1)}{\Gamma(s - n + 1)} = \frac{n\,\Gamma(s + 1)}{(s - n + 1)\,\Gamma(s - n + 1)} = \frac{n\,\Gamma(s + 1)}{\Gamma(s - n + 2)}$$

or

$$\Delta s^{(n)} = n\,s^{(n-1)}, \tag{7.9.6}$$

where $\Delta$ denotes the forward-difference operator with unit spacing,
and where use is made of the fundamental property of the gamma function,†

$$\Gamma(u + 1) = u\,\Gamma(u). \tag{7.9.7}$$


Thus the factorial power $s^{(n)}$ is related to the operator $\Delta$
just as the ordinary power $x^n$ is related to the operator
$D \equiv d/dx$. In this connection, it is of interest to notice that
Newton's forward-difference formula (4.3.5) can be written in the form

$$f_s = f_0 + \frac{\Delta f_0}{1!}\,s^{(1)} + \frac{\Delta^2 f_0}{2!}\,s^{(2)} + \cdots + \frac{\Delta^n f_0}{n!}\,s^{(n)} + \frac{h^{n+1} f^{(n+1)}(\xi)}{(n + 1)!}\,s^{(n+1)} \tag{7.9.8}$$

and is seen to be completely analogous to the Maclaurin series, with a
remainder, expressed in the form

$$f(x) = f(0) + \frac{f'(0)}{1!}\,x + \frac{f''(0)}{2!}\,x^2 + \cdots + \frac{f^{(n)}(0)}{n!}\,x^n + \frac{f^{(n+1)}(\xi)}{(n + 1)!}\,x^{n+1}.$$

The combination of (7.9.2) and (7.9.6) leads to the relation

$$\Delta\binom{s}{n} = \frac{\Delta s^{(n)}}{n!} = \frac{n\,s^{(n-1)}}{n!} = \frac{s^{(n-1)}}{(n - 1)!}$$

and hence to the corresponding useful property

$$\Delta\binom{s}{n} \equiv \binom{s + 1}{n} - \binom{s}{n} = \binom{s}{n - 1} \tag{7.9.9}$$

of the binomial coefficient functions.

† It is often convenient to write $u!$ as an abbreviation for
$\Gamma(u + 1)$, even though $u$ is not a positive integer.
We recall next that, from the telescoping of terms in the expansion

$$\sum_{s=M}^{N} \Delta f_s = (f_{M+1} - f_M) + (f_{M+2} - f_{M+1}) + \cdots + (f_N - f_{N-1}) + (f_{N+1} - f_N),$$

it follows that

$$\sum_{s=M}^{N} \Delta f_s = f_{N+1} - f_M \equiv \bigl[f_s\bigr]_M^{N+1}. \tag{7.9.10}$$

This general relation is seen to be analogous to the relation

$$\int_a^b f'(x)\,dx = f(b) - f(a) \equiv \bigl[f(x)\bigr]_a^b,$$

but careful notice should be taken of the fact that the limits on the
right in (7.9.10) are not the same as those on the left. Thus, in
particular, we may deduce from (7.9.10) and (7.9.6) the summation formula

$$\sum_{s=M}^{N} s^{(n)} = \left[\frac{s^{(n+1)}}{n + 1}\right]_M^{N+1} \qquad (n \ne -1), \tag{7.9.11}$$

which clearly corresponds to the integral formula

$$\int_a^b x^n\,dx = \left[\frac{x^{n+1}}{n + 1}\right]_a^b \qquad (n \ne -1).$$

Since (7.9.8) permits any polynomial in $s$ to be expressed as a linear
combination of factorial powers of $s$, (7.9.11) then serves to effect
the summation of that polynomial. In illustration, in order to express
the sum

$$S_n = 1 \cdot 3 + 2 \cdot 4 + \cdots + n(n + 2)$$

in closed form, we could first obtain the relation

$$s^2 + 2s = 0 + 3s^{(1)} + s^{(2)},$$

by use of undetermined coefficients or by using (7.9.8), and then make
the calculation

$$S_n = \sum_{s=1}^{n}\bigl[3s^{(1)} + s^{(2)}\bigr] = \Bigl[\tfrac{3}{2}s^{(2)} + \tfrac{1}{3}s^{(3)}\Bigr]_1^{n+1} = \tfrac{3}{2}(n + 1)n + \tfrac{1}{3}(n + 1)n(n - 1) = \tfrac{1}{6}\,n(n + 1)(7 + 2n).$$

The summation could also be effected by making appropriate use of
(5.8.4) or of the Euler-Maclaurin sum formula (5.8.12).
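The difference property of the factorial powers, the summation formula
with its shifted upper limit, and the closed form of the sum
$1 \cdot 3 + 2 \cdot 4 + \cdots + n(n + 2)$ can be confirmed by direct
computation. The following Python sketch is an illustration only,
restricted to nonnegative integer exponents; the function names are
hypothetical.

```python
from fractions import Fraction

def falling(s, n):
    """Factorial power s^(n) = s (s - 1) ... (s - n + 1) for a nonnegative integer n."""
    out = Fraction(1)
    for i in range(n):
        out *= s - i
    return out

def sum_factorial_power(n, lo, hi):
    """Sum of s^(n) for s = lo, ..., hi, evaluated as [s^(n+1) / (n+1)] between lo and hi + 1."""
    return (falling(hi + 1, n + 1) - falling(lo, n + 1)) / (n + 1)

def s_closed(n):
    """1*3 + 2*4 + ... + n(n + 2), using s^2 + 2s = 3 s^(1) + s^(2)."""
    return 3 * sum_factorial_power(1, 1, n) + sum_factorial_power(2, 1, n)

for n in (1, 2, 5, 10):
    brute = sum(s * (s + 2) for s in range(1, n + 1))
    print(n, brute, s_closed(n), Fraction(n * (n + 1) * (2 * n + 7), 6))
```

Each printed row lists the brute-force sum, the value obtained from the
factorial-power summation, and the closed form $n(n + 1)(2n + 7)/6$. The
upper limit `hi + 1` in `sum_factorial_power` reflects the warning that
the limits on the two sides of the summation relation differ.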

There exist a large number of useful identities involving either
factorial power functions or, correspondingly, binomial coefficient
functions. In particular, the relation

$$s^{(n)}\,(s - n)^{(k)} = s^{(n+k)}, \tag{7.9.12}$$

which follows immediately from the fact that the left-hand member is
given by

$$[s(s - 1) \cdots (s - n + 1)]\,[(s - n)(s - n - 1) \cdots (s - n - k + 1)],$$

is of frequent use.

From the definition (7.9.2), we obtain the property

$$\binom{n}{k} = \binom{n}{n - k}. \tag{7.9.13}$$

Also, from the fact that

$$(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k \tag{7.9.14}$$

when $n$ is a nonnegative integer, we deduce in particular that

$$\sum_{k=0}^{n} \binom{n}{k} = 2^n \tag{7.9.15}$$

and

$$\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0, \tag{7.9.16}$$

by setting $x = 1$ and $x = -1$, respectively.

Many other important relations are obtainable from the identity

$$\binom{m + p}{n} = \sum_{k=0}^{n} \binom{m}{k}\binom{p}{n - k} = \sum_{k=0}^{n} \binom{p}{k}\binom{m}{n - k} \tag{7.9.17}$$

when $n$ is a nonnegative integer. In order to establish this relation,
we multiply together the series expansions of $(1 + x)^p$ and
$(1 + x)^m$, noticing that those series terminate when $p$ and $m$ are
nonnegative integers and are absolutely convergent infinite series when
$|x| < 1$ otherwise, to obtain the results

$$(1 + x)^{m+p} = \left[\sum_{i=0}^{\infty} \binom{m}{i} x^i\right]\left[\sum_{j=0}^{\infty} \binom{p}{j} x^j\right] = \sum_{n=0}^{\infty}\left[\sum_{k=0}^{n} \binom{m}{k}\binom{p}{n - k}\right] x^n \qquad (|x| < 1).$$

But since also

$$(1 + x)^{m+p} = \sum_{n=0}^{\infty} \binom{m + p}{n} x^n \qquad (|x| < 1),$$

and since the coefficient of $x^n$ must be the same in these two forms,
the first form of the desired result (7.9.17) follows. The second form
results from interchanging $p$ and $m$. As a special case, we may take
$m = p = n$ in (7.9.17) and use (7.9.13) to obtain the result

$$\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}. \tag{7.9.18}$$
& 


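We may notice next that

$$\binom{-p}{q} = \frac{(-p)(-p - 1) \cdots (-p - q + 1)}{q!} = (-1)^q\,\frac{(p + q - 1)(p + q - 2) \cdots (p)}{q!}$$

and hence

$$\binom{-p}{q} = (-1)^q \binom{p + q - 1}{q} \tag{7.9.19}$$

when $q$ is a nonnegative integer. Hence we deduce from (7.9.17) and
(7.9.19) the further relations

$$\binom{p - m}{n} = \sum_{k=0}^{n} (-1)^k \binom{m + k - 1}{k}\binom{p}{n - k} = \sum_{k=0}^{n} (-1)^{n-k} \binom{m + n - k - 1}{n - k}\binom{p}{k}. \tag{7.9.20}$$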


All these formulas can be expressed alternatively in terms of factorial
power functions, by making use of (7.9.2).
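Several of the binomial identities of this section can be spot-checked
in exact arithmetic. The following Python sketch is illustrative only:
`gbinom` is a hypothetical helper implementing the definition
$\binom{x}{q} = x^{(q)}/q!$ for a nonnegative integer $q$ and an
arbitrary rational $x$, and the sample values of $m$, $p$, and $n$ are
arbitrary choices.

```python
from fractions import Fraction
from math import comb, factorial

def gbinom(x, q):
    """Binomial coefficient function x^(q) / q! for integer q >= 0 and rational x."""
    num = Fraction(1)
    for i in range(q):
        num *= Fraction(x) - i
    return num / factorial(q)

def convolution_sum(m, p, n):
    """Sum over k = 0..n of C(m, k) C(p, n - k), the right-hand side of the product identity."""
    return sum(gbinom(m, k) * gbinom(p, n - k) for k in range(n + 1))

# The convolution identity, with integral and nonintegral upper arguments.
for m, p in ((4, 2), (-3, 5), (Fraction(1, 2), Fraction(5, 3))):
    for n in range(5):
        print(m, p, n, convolution_sum(m, p, n) == gbinom(m + p, n))

# The sum of squared coefficients and the negative-upper-index relation.
print(all(sum(comb(n, k) ** 2 for k in range(n + 1)) == comb(2 * n, n) for n in range(8)))
print(all(gbinom(-p, q) == (-1) ** q * comb(p + q - 1, q)
          for p in range(1, 5) for q in range(5)))
```

Every comparison should print `True`. The nonintegral cases exercise the
identity in the range where the series are infinite.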
Finally, a general formula which will be needed in the sequel is the sum
analogy to integration by parts. In order to derive the desired formula,
we notice first that

$$\Delta(u_s v_s) = u_{s+1}v_{s+1} - u_s v_s = v_{s+1}(u_{s+1} - u_s) + u_s(v_{s+1} - v_s) = u_s\,\Delta v_s + v_{s+1}\,\Delta u_s.$$

Hence, by transposition, and use of (7.9.10), we deduce the formula for
summation by parts in the form

$$\sum_{s=M}^{N} u_s\,\Delta v_s = \bigl[u_s v_s\bigr]_M^{N+1} - \sum_{s=M}^{N} v_{s+1}\,\Delta u_s. \tag{7.9.21}$$

Also, by replacing $u_s$ by $v_s$ and $v_s$ by $u_{s-1}$, and transposing
terms, we deduce the alternative form

$$\sum_{s=M}^{N} u_s\,\Delta v_s = \bigl[u_{s-1} v_s\bigr]_M^{N+1} - \sum_{s=M}^{N} v_s\,\Delta u_{s-1}. \tag{7.9.22}$$
s=M s=M 


7.10. Polynomials Orthogonal over Discrete Ranges. For least-squares
approximation over a discrete range $R$, it is convenient to make use of
a set of polynomials which are mutually orthogonal under summation over
$R$, with respect to a specified weighting function. We suppose that
$N + 1$ points are to be employed in the approximation, with uniform
separation $h$, and that the extreme points are at the ends of the
interval $(a,b)$, where $b - a = Nh$. If we then write

$$x = a + sh, \tag{7.10.1}$$

the variable $s$ takes on the values $s = 0, 1, 2, \ldots, N$ at those
points, and we seek a set of polynomials $\phi_0(s,N)$, $\phi_1(s,N)$,
. . . , $\phi_N(s,N)$ such that $\phi_r$ is of degree $r$ in $s$, and
such that

$$\sum_{s=0}^{N} w(s)\,\phi_r(s,N)\,q_{r-1}(s) = 0, \tag{7.10.2}$$

where $w(s)$ is a specified weighting function, assumed to be nonnegative
in $R$, and where $q_{r-1}(s)$ is an arbitrary polynomial of degree
$r - 1$ or less in $R$.

The procedure is analogous to that employed in §7.4. We first set

$$w(s)\,\phi_r(s,N) = \Delta^r U_r(s,N), \tag{7.10.3}$$

so that (7.10.2) becomes

$$\sum_{s=0}^{N} \bigl[\Delta^r U_r(s,N)\bigr]\,q_{r-1}(s) = 0, \tag{7.10.4}$$

and sum by parts $r$ times, noticing that $\Delta^r q_{r-1}(s) = 0$, to
transform (7.10.4) to the relation

$$\Bigl[\{\Delta^{r-1}U_r(s)\}\,q_{r-1}(s) - \{\Delta^{r-2}U_r(s + 1)\}\,\Delta q_{r-1}(s) + \cdots + (-1)^{r-1}\{U_r(s + r - 1)\}\,\Delta^{r-1}q_{r-1}(s)\Bigr]_{s=0}^{s=N+1} = 0. \tag{7.10.5}$$

Since we require that $\phi_r$ be a polynomial of degree $r$, it follows
from (7.10.3) that $U_r$ must satisfy the difference equation

$$\Delta^{r+1}\left[\frac{1}{w(s)}\,\Delta^r U_r(s,N)\right] = 0 \tag{7.10.6}$$





















288 INTRODUCTION TO NUMERICAL ANALYSIS LEAST-SQUARES POLYNOMIAL APPROXIMATION 289 


in R and, because of the arbitrariness of g,1(s), A1g,—1(s), and so forth, 
when s = Oands = N + 1, that U,(s + r — 1, N), A.U,(s + r — 2, N), 

. , and $\Delta^{r-1}U_r(s,N)$ must vanish when $s = 0$ and when $s = N + 1$. 
It is easily seen that these requirements are equivalent to the $2r$ conditions 

$$U_r(0,N) = U_r(1,N) = U_r(2,N) = \cdots = U_r(r-1,N) = 0 \qquad (7.10.7)$$

and 

$$U_r(N+1,N) = U_r(N+2,N) = \cdots = U_r(N+r,N) = 0. \qquad (7.10.8)$$

Once $U_r$ has been determined, necessarily with an arbitrary multiplicative 
constant, there follows, from (7.10.3), 

$$\varphi_r(s,N) = \frac{1}{w(s)}\,\Delta^r U_r(s,N). \qquad (7.10.9)$$

In consequence of the results of §7.2, the coefficients in the relation 

$$f(s) \approx \sum_{r=0}^{n} a_r\varphi_r(s,N) \qquad (7.10.10)$$

are then determined, by the requirement 

$$\sum_{s=0}^{N} w(s)\Big[f(s) - \sum_{r=0}^{n} a_r\varphi_r(s,N)\Big]^2 = \min, \qquad (7.10.11)$$

in the form 

$$a_r = \frac{1}{\gamma_r}\sum_{s=0}^{N} w(s)f(s)\varphi_r(s,N), \qquad (7.10.12)$$

where 

$$\gamma_r = \sum_{s=0}^{N} w(s)\varphi_r^2(s,N). \qquad (7.10.13)$$

7.11. Gram Approximation. We restrict attention here to the case 
when $w(s) = 1$. Equation (7.10.6) then requires merely that $U_r(s,N)$ 
be a polynomial in $s$ of degree $2r$ and, since (7.10.7) and (7.10.8) determine 
its $2r$ zeros, there follows immediately 

$$U_r(s,N) = C_{rN}[s(s-1)\cdots(s-r+1)][(s-N-1)(s-N-2)\cdots(s-N-r)]$$

or 

$$U_r(s,N) = C_{rN}\,s^{(r)}(s-N-1)^{(r)}, \qquad (7.11.1)$$

where $C_{rN}$ is an arbitrary constant. Hence we have also 

$$\varphi_r(s,N) = C_{rN}\,\Delta^r[s^{(r)}(s-N-1)^{(r)}]. \qquad (7.11.2)$$

In order to express this result in a more explicit form, we first expand 
$(s-N-1)^{(r)}$ in terms of factorial powers of $(s-r)$, then use (7.9.12) 
to express (7.11.1) in terms of factorial powers of $s$, and, finally, make use 
of (7.9.6). Thus, if we make use of the second form of (7.9.20), we obtain 

$$(s-N-1)^{(r)} = [(s-r) - (N-r+1)]^{(r)} = (-1)^r r!\sum_{k=0}^{r}(-1)^k\frac{(N-k)^{(r-k)}}{k!\,(r-k)!}\,(s-r)^{(k)} \qquad (7.11.3)$$

and hence, in virtue of (7.11.1) and (7.9.12), 

$$U_r(s,N) = (-1)^r r!\,C_{rN}\sum_{k=0}^{r}(-1)^k\frac{(N-k)^{(r-k)}}{k!\,(r-k)!}\,s^{(r+k)}. \qquad (7.11.4)$$

Thus, by making use of (7.9.6), we obtain the result 

$$\varphi_r(s,N) = (-1)^r r!\,C_{rN}\sum_{k=0}^{r}(-1)^k\frac{(r+k)^{(r)}(N-k)^{(r-k)}}{k!\,(r-k)!}\,s^{(k)}, \qquad (7.11.5)$$

which can be transformed easily to the more convenient form 

$$\varphi_r(s,N) = c_{rN}\sum_{k=0}^{r}(-1)^k\frac{(r+k)^{(2k)}}{(k!)^2}\,\frac{s^{(k)}}{N^{(k)}}, \qquad (7.11.6)$$

where $c_{rN}$ has been written for the arbitrary constant $(-1)^r r!\,N^{(r)}C_{rN}$. 
The expanded form appears as follows: 

$$\varphi_r(s,N) = c_{rN}\Big[1 - \frac{r(r+1)}{(1!)^2}\frac{s}{N} + \frac{(r-1)r(r+1)(r+2)}{(2!)^2}\frac{s(s-1)}{N(N-1)} - \frac{(r-2)(r-1)r(r+1)(r+2)(r+3)}{(3!)^2}\frac{s(s-1)(s-2)}{N(N-1)(N-2)} + \cdots\Big]. \qquad (7.11.7)$$

In most applications of least-squares methods, it is convenient to make 
use of an odd number of ordinates, so that $N$ is even, and to write 

$$N = 2M.$$

In such cases, it is also convenient to make the change of variables 

$$s = M + t, \qquad (7.11.8)$$

so that $t$ represents distance from the mid-point of the range $R$ in units 
of the spacing $h$ (see Fig. 7.1) and takes on the values $0, \pm 1, \pm 2, \ldots, \pm M$ 
at the $2M + 1$ points of $R$. 

[Fig. 7.1: the range from a to b, extending a distance Mh on each side of its mid-point, with spacing h.] 

If also we choose 

$$c_{rN} = (-1)^r \qquad (7.11.9)$$

and write $p_r(t,2M) = \varphi_r(s,2M)$, the polynomials of degrees zero through 
five can be expressed explicitly as follows: 


$$p_0(t,2M) = 1,$$

$$p_1(t,2M) = \frac{t}{M},$$

$$p_2(t,2M) = \frac{3t^2 - M(M+1)}{M(2M-1)},$$

$$p_3(t,2M) = \frac{5t^3 - (3M^2 + 3M - 1)t}{M(M-1)(2M-1)}, \qquad (7.11.10)$$

$$p_4(t,2M) = \frac{35t^4 - 5(6M^2 + 6M - 5)t^2 + 3M(M^2-1)(M+2)}{2M(M-1)(2M-1)(2M-3)},$$

$$p_5(t,2M) = \frac{63t^5 - 35(2M^2 + 2M - 3)t^3 + (15M^4 + 30M^3 - 35M^2 - 50M + 12)t}{2M(M-1)(M-2)(2M-1)(2M-3)}.$$

These polynomials thus possess the orthogonality property 

$$\sum_{t=-M}^{M} p_i(t,2M)\,p_j(t,2M) = 0 \qquad (i \ne j) \qquad (7.11.11)$$

and are usually known as Gram polynomials (or as Chebyshev polynomials, 
although the latter name is usually reserved for either the polynomials 
considered in §7.8 or those to be considered in §8.13). It may be seen 
that the $r$th polynomial is an even function of $t$ when $r$ is even, and an 
odd function of $t$ when $r$ is odd. Also, each polynomial takes on the value 
unity when $t = M$. Further, if $t$ is replaced by $Mx$ and $M$ is then 
increased without limit, it can be verified that $p_r(t,2M)$ tends to the 
Legendre polynomial $P_r(x)$, 

$$\lim_{M\to\infty} p_r(Mx,2M) = P_r(x).$$


In accordance with the results of the preceding section, the $n$th-degree 
least-squares polynomial approximation to $f(t)$ over the $(2M + 1)$-point 
range $t = -M, -M+1, \ldots, -1, 0, 1, \ldots, M-1, M$ is given by 

$$y(t) = \sum_{r=0}^{n} a_r\,p_r(t,2M) \qquad (7.11.12)$$

where 

$$a_r = \frac{1}{\gamma_r}\sum_{t=-M}^{M} f(t)\,p_r(t,2M), \qquad (7.11.13)$$

and where 

$$\gamma_r = \sum_{t=-M}^{M} p_r^2(t,2M). \qquad (7.11.14)$$

As in the earlier developments, the factors $\gamma_r$ are independent of the 
function $f(t)$ which is to be approximated and can be calculated once and 
for all (see Prob. 42). 
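
The relations (7.11.6) and (7.11.12) to (7.11.14) can be sketched in a few lines of Python. The following is only an illustration under stated assumptions: the names `falling`, `gram`, and `least_squares_coeffs` are hypothetical helpers introduced here, exact rational arithmetic is used in place of desk calculation, and it is assumed that $r \le 2M$. 

```python
from fractions import Fraction
from math import factorial


def falling(x, k):
    """Factorial power x^(k) = x(x-1)...(x-k+1); equals 1 when k = 0."""
    out = Fraction(1)
    for i in range(k):
        out *= x - i
    return out


def gram(r, t, M):
    """p_r(t, 2M) from (7.11.6) with N = 2M, s = t + M, c_rN = (-1)^r."""
    N, s = 2 * M, t + M
    total = Fraction(0)
    for k in range(r + 1):
        coeff = falling(r + k, 2 * k) / factorial(k) ** 2
        total += (-1) ** k * coeff * falling(s, k) / falling(N, k)
    return (-1) ** r * total


def least_squares_coeffs(f_values, M, n):
    """Coefficients a_0..a_n of (7.11.13) for ordinates given at t = -M..M."""
    ts = range(-M, M + 1)
    coeffs = []
    for r in range(n + 1):
        gamma = sum(gram(r, t, M) ** 2 for t in ts)          # (7.11.14)
        num = sum(Fraction(f) * gram(r, t, M) for f, t in zip(f_values, ts))
        coeffs.append(num / gamma)
    return coeffs


# Fitting the ordinates of t^2 over five points (M = 2), degree two.
print(least_squares_coeffs([4, 1, 0, 1, 4], M=2, n=2))
```

Since $t^2 = 2p_0(t,4) + 2p_2(t,4)$, the printed coefficients should be 2, 0, and 2. 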

It should be noted that various conventions are adopted in the literature 
with regard to the value assigned to the arbitrary multiplicative 
constant $c_{rN}$ in (7.11.7) in the general case. In particular, that constant 
is sometimes so defined that the coefficient of $s^r$ in $\varphi_r(s,N)$ is unity. 
Another common choice is that for which the values taken on when $s = 0, 
1, \ldots, N$ are integers without a common factor, so that tabulation is 
simplified. When $N + 1$ points are used, the sum of the squares of the 
$N + 1$ tabular values of the $r$th-degree polynomial corresponding to the 
normalization (7.11.9) used here is found to be 

$$\frac{(N+r+1)!\,(N-r)!}{(2r+1)(N!)^2}$$

(see Prob. 42), whereas (7.11.7) shows that the coefficient of $s^r$ is 

$$\frac{(2r)!\,(N-r)!}{(r!)^2\,N!}.$$

These results permit tabulations relevant to other normalizations to be 
interpreted in terms of the one used here. 
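
The two closed forms just quoted are easily evaluated. The following Python sketch is illustrative only: the function names are hypothetical, and the leading coefficient is taken to be $(2r)!\,(N-r)!/[(r!)^2 N!]$, which is the value obtained from the term $k = r$ of (7.11.6) under the normalization (7.11.9). 

```python
from fractions import Fraction
from math import factorial


def gamma_closed(r, N):
    """Sum of squares of the N + 1 tabular values under normalization (7.11.9)."""
    return Fraction(factorial(N + r + 1) * factorial(N - r),
                    (2 * r + 1) * factorial(N) ** 2)


def leading_coefficient(r, N):
    """Coefficient of s^r under (7.11.9), read off from the k = r term of (7.11.6)."""
    return Fraction(factorial(2 * r) * factorial(N - r),
                    factorial(r) ** 2 * factorial(N))


# Five points (N = 4): sums of squares for degrees zero through four.
print([gamma_closed(r, 4) for r in range(5)])
# Dividing a tabulated polynomial by leading_coefficient(r, N) gives the
# normalization in which the coefficient of s^r is unity.
print([leading_coefficient(r, 4) for r in range(5)])
```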

7.12. Example. Five-point Least-squares Approximation. In order 
to illustrate the method of using the preceding results, we consider here 
the case in which only five ordinates are used, so that $M = 2$. The 
relevant orthogonal polynomials of degrees zero through four are then 
obtained from (7.11.10) in the forms 

$$p_0(t) = 1, \qquad p_1(t) = \tfrac{1}{2}t, \qquad p_2(t) = \tfrac{1}{2}(t^2 - 2),$$

$$p_3(t) = \tfrac{1}{6}(5t^3 - 17t), \qquad p_4(t) = \tfrac{1}{12}(35t^4 - 155t^2 + 72), \qquad (7.12.1)$$

if we write $p_r(t)$ for $p_r(t,4)$. 
We may notice that $p_5(t)$, as defined by (7.11.10), is nonexistent when 
$M = 2$. This situation corresponds to the fact that the use of polynomials 
of degrees zero through five over five points would not lead to a 
determinate problem, since infinitely many fifth-degree polynomials 
would fit the data exactly at those points. Further, the use of polynomials 
of degrees zero through four over five points truly would not be a "least-squares" 
procedure, since it necessarily would lead to the fourth-degree 
polynomial which fits the data exactly. Thus $p_4(t)$ is not needed when 
five points are used unless an exact fit at those points is desired, in which 
case the use of methods given in earlier chapters is usually to be preferred. 

Values of the polynomials at the five relevant points are easily determined 
and may be tabulated as follows: 

    γ_r      5      2.5      3.5      10 
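
The entries of such a table can be regenerated directly from (7.12.1). The following Python sketch is an illustration only: the layout of the printed rows is an assumption, and the sum of squares for $p_4$ is included although it is not needed for a least-squares fit over five points. 

```python
from fractions import Fraction

# The five-point polynomials of (7.12.1), in exact rational arithmetic.
POLYS = [
    lambda t: Fraction(1),
    lambda t: Fraction(t, 2),
    lambda t: Fraction(t * t - 2, 2),
    lambda t: Fraction(5 * t ** 3 - 17 * t, 6),
    lambda t: Fraction(35 * t ** 4 - 155 * t ** 2 + 72, 12),
]
POINTS = [-2, -1, 0, 1, 2]

# table[t][r] holds p_r(t); gammas[r] is the sum of squares of column r.
table = {t: [p(t) for p in POLYS] for t in POINTS}
gammas = [sum(table[t][r] ** 2 for t in POINTS) for r in range(len(POLYS))]

for t in POINTS:
    print(t, [str(v) for v in table[t]])
print("gamma", [str(g) for g in gammas])
```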


According to (7.11.12) to (7.11.14), the coefficient $a_r$ of each $p_r(t)$ used 
in the approximation is obtained by multiplying each entry in its column 
by the corresponding entry in the column of values of $f(t)$, summing, and 
dividing the result by $\gamma_r$, which is listed at the foot of the $p_r$ column. 
Once the $a$'s are calculated, the least-squares polynomial $y(t)$ can be 
obtained explicitly by forming the corresponding combination of the polynomials 
listed in (7.12.1). If only the value of $y(t)$ at a tabular point is 
required, this explicit form of $y(t)$ is not needed, since the required value 
is obtained by merely multiplying the tabulated value of each $p_r$ for that 
point by $a_r$, and summing the results. 

As an example, suppose that we are provided with the empirical data 

    x       0.0     0.2     0.4     0.6     0.8 
    F(x)    1.10    1.78    2.74    4.12    5.69 

together with some assurance that the observed values are in error by no 
more than a few units in the last place given, and that the true function 
is "smooth." In order to obtain least-squares polynomial approximations 
by use of Table 7.1, we then set $x = 0.4 + 0.2t$, or $t = 5x - 2$, and 
write $F(0.4 + 0.2t) = f(t)$. Calculation then gives 

$$a_0 = 3.086, \qquad a_1 = 2.304, \qquad a_2 = 0.314, \qquad a_3 = -0.009,$$




so that least-squares approximations of degrees one, two, and three are 
obtained by retaining two, three, or four terms in the relation 

$$f(t) \approx 3.086\,p_0(t) + 2.304\,p_1(t) + 0.314\,p_2(t) - 0.009\,p_3(t).$$

The corresponding "smoothed" values at the tabular points may be 
obtained, from Table 7.1, as follows: 

    t       -2       -1        0        1        2 
    f       1.10     1.78     2.74     4.12     5.69 
    y_3     1.105    1.759    2.772    4.099    5.695 
    y_2     1.096    1.777    2.772    4.081    5.704 
    y_1     0.782    1.934    3.086    4.238    5.390 


The RMS value of the five deviations from the observed values is found 
to be 0.0198 for the third-degree approximation, 0.0235 for the second- 
degree approximation, and 0.264 for the linear approximation. The use 
of (7.3.34) then leads to corresponding estimates of 0.0443, 0.0372, 
and 0.341, respectively, for the RMS error in the observed values. 
Clearly, only the first two of these estimates are in accord with the given 
information. 
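
The arithmetic of this example may be checked by a short Python sketch. It is illustrative only: the polynomial values are those given by (7.12.1) at $t = -2, \ldots, 2$, the names are hypothetical, and the estimate attributed to (7.3.34) is assumed here to amount to multiplying the RMS deviation by $\sqrt{(N+1)/(N-n)}$ with $N + 1 = 5$, an assumption which reproduces the figures quoted above. 

```python
import math

F = [1.10, 1.78, 2.74, 4.12, 5.69]      # f(t) at t = -2, -1, 0, 1, 2
P = [[1, 1, 1, 1, 1],                   # p_0(t)
     [-1, -0.5, 0, 0.5, 1],             # p_1(t)
     [1, -0.5, -1, -0.5, 1],            # p_2(t)
     [-1, 2, 0, -2, 1]]                 # p_3(t)
GAMMA = [5, 2.5, 3.5, 10]               # sums of squares of each row of P

# Coefficients a_r of (7.11.13).
a = [sum(f * p for f, p in zip(F, P[r])) / GAMMA[r] for r in range(4)]


def smoothed(n):
    """Values of the nth-degree least-squares polynomial at the five points."""
    return [sum(a[r] * P[r][i] for r in range(n + 1)) for i in range(5)]


def rms_deviation(n):
    """RMS of the five deviations between smoothed and observed values."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(smoothed(n), F)) / 5)


def rms_error_estimate(n):
    """Assumed form of (7.3.34): rescale by sqrt((N + 1)/(N - n)), N = 4."""
    return rms_deviation(n) * math.sqrt(5 / (4 - n))


for n in (3, 2, 1):
    print(n, round(rms_deviation(n), 4), round(rms_error_estimate(n), 4))
```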

If the smallest of these estimates is accepted as the most appropriate 
one, we may conclude that the additional smoothing afforded by the use 
of a parabolic approximation, in place of a cubic, probably represents a 
further removal of ‘‘noise” rather than a departure from the unknown 
true function. 

If additional values of the least-squares polynomial are desired, they 
may be obtained conveniently by interpolation. However, if the equation 
of the parabolic approximation is required, it may be written down in 
the form 

$$y = 3.086\,p_0(t) + 2.304\,p_1(t) + 0.314\,p_2(t) = 2.772 + 1.152t + 0.157t^2$$

and reduced, if so desired, to the form 

$$y = 1.096 + 2.620x + 3.925x^2.$$

In particular, this result supplies the approximations 2.62, 5.76, and 
8.90 to the slope of the unknown function at $x$ = 0.0, 0.4, and 0.8, respectively, 
whereas the third-degree approximation would yield the values 
2.30, 5.89, and 8.58. On the other hand, the result of differentiating the 
fourth-degree interpolation polynomial, which takes on the five observed 
values exactly, would give the respective values 3.40, 5.89, and 7.48. 
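
These slopes follow from differentiating the polynomials of (7.12.1) and using $dy/dx = 5\,dy/dt$. The Python sketch below is an illustration under stated assumptions: the names are hypothetical, the sum of squares 70 for $p_4$ is computed from the tabular values 1, -4, 6, -4, 1, and unrounded coefficients are carried, so that the results may differ from the quoted figures by a few units in the last place. 

```python
F = [1.10, 1.78, 2.74, 4.12, 5.69]      # observed values at t = -2, ..., 2
T = [-2, -1, 0, 1, 2]

POLYS = [lambda t: 1.0,
         lambda t: t / 2,
         lambda t: (t * t - 2) / 2,
         lambda t: (5 * t ** 3 - 17 * t) / 6,
         lambda t: (35 * t ** 4 - 155 * t * t + 72) / 12]
# Derivatives with respect to t of the polynomials above.
DERIVS = [lambda t: 0.0,
          lambda t: 0.5,
          lambda t: float(t),
          lambda t: (15 * t * t - 17) / 6,
          lambda t: (140 * t ** 3 - 310 * t) / 12]
GAMMA = [5, 2.5, 3.5, 10, 70]

a = [sum(f * p(t) for f, t in zip(F, T)) / g for p, g in zip(POLYS, GAMMA)]


def slope(x, n):
    """dy/dx of the nth-degree approximation, with t = 5x - 2."""
    t = 5 * x - 2
    return 5 * sum(a[r] * DERIVS[r](t) for r in range(n + 1))


for n in (2, 3, 4):
    print(n, [round(slope(x, n), 2) for x in (0.0, 0.4, 0.8)])
```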

By expressing the $a$'s explicitly in terms of the observed values, we may 
obtain formulas which express the smoothed values directly in terms of 
the observed ones. Thus, corresponding to the third-degree least-squares 
approximation over five points, we obtain the formula 

$$y_0 = a_0 - a_2 = \tfrac{1}{5}(f_{-2} + f_{-1} + f_0 + f_1 + f_2) - (\tfrac{2}{7}f_{-2} - \tfrac{1}{7}f_{-1} - \tfrac{2}{7}f_0 - \tfrac{1}{7}f_1 + \tfrac{2}{7}f_2)$$

or 

$$y_0 = \tfrac{1}{35}(-3f_{-2} + 12f_{-1} + 17f_0 + 12f_1 - 3f_2), \qquad (7.12.2)$$

for the smoothed value at the mid-point $t = 0$, and the formulas 

$$y_{-2} = \tfrac{1}{70}(69f_{-2} + 4f_{-1} - 6f_0 + 4f_1 - f_2),$$

$$y_{-1} = \tfrac{1}{35}(2f_{-2} + 27f_{-1} + 12f_0 - 8f_1 + 2f_2), \qquad (7.12.3)$$

$$y_1 = \tfrac{1}{35}(2f_{-2} - 8f_{-1} + 12f_0 + 27f_1 + 2f_2),$$

$$y_2 = \tfrac{1}{70}(-f_{-2} + 4f_{-1} - 6f_0 + 4f_1 + 69f_2),$$

are obtained in a similar way. It is of interest to notice that these formulas 
can also be expressed in the simple forms 

$$y_{-2} = f_{-2} - \tfrac{1}{70}\delta^4 f_0, \qquad y_{-1} = f_{-1} + \tfrac{2}{35}\delta^4 f_0, \qquad y_0 = f_0 - \tfrac{3}{35}\delta^4 f_0,$$

$$y_1 = f_1 + \tfrac{2}{35}\delta^4 f_0, \qquad y_2 = f_2 - \tfrac{1}{70}\delta^4 f_0. \qquad (7.12.4)$$

The simplicity of these last forms is due to the fact that the degree of 
the least-squares polynomial is exactly one less than the degree of the 
polynomial which would be uniquely determined by the five data. In 
the cases when this difference exceeds unity, the formulas are less simply 
expressed in terms of differences, particularly for off-center points. 

Formulas which avoid the necessity of effecting the summations 
may also be obtained by first resolving the relations (7.12.1) in the forms 

$$1 = p_0(t), \qquad t = 2p_1(t), \qquad t^2 = 2p_0(t) + 2p_2(t),$$

$$t^3 = \tfrac{1}{5}[34p_1(t) + 6p_3(t)], \qquad t^4 = \tfrac{1}{35}[238p_0(t) + 310p_2(t) + 12p_4(t)]. \qquad (7.12.5)$$


Now the interpolation polynomial (of degree four) which agrees exactly 
with $f(t)$ when $t = 0, \pm 1$, and $\pm 2$ can be expressed in the Stirling form 

$$f_0 + (\mu\delta f_0)t + (\tfrac{1}{2}\delta^2 f_0)t^2 + (\tfrac{1}{6}\mu\delta^3 f_0)(t^3 - t) + (\tfrac{1}{24}\delta^4 f_0)(t^4 - t^2), \qquad (7.12.6)$$

and hence, by introducing (7.12.5) into (7.12.6), we obtain the relation 

$$f(t) = (f_0 + \delta^2 f_0 + \tfrac{1}{5}\delta^4 f_0)p_0(t) + (2\mu\delta f_0 + \tfrac{4}{5}\mu\delta^3 f_0)p_1(t) + (\delta^2 f_0 + \tfrac{2}{7}\delta^4 f_0)p_2(t) + (\tfrac{1}{5}\mu\delta^3 f_0)p_3(t) + (\tfrac{1}{70}\delta^4 f_0)p_4(t), \qquad (7.12.7)$$

when $t$ is restricted to the values $0, \pm 1$, and $\pm 2$. For other values of $t$ 
the right-hand member represents the fourth-degree interpolation polynomial 
which coincides with $f(t)$ at those five points. The associated 
error $E(t)$ can be expressed in the familiar form 

$$E(t) = \tfrac{1}{120}\,t(t^2 - 1)(t^2 - 4)\,f^{\mathrm{v}}(\xi) \qquad (|\xi| < 2),$$

when $f^{\mathrm{v}}(t)$ exists and is continuous for $-2 < t < 2$. 

Since the right-hand member of (7.12.7) is accordingly the polynomial 
which would be afforded by fourth-degree least squares, relative to the 
five points involved, and since the coefficients of the $p$'s are independent 
of the number of $p$'s retained, it follows that the third-degree least-squares 
polynomial relative to those points is then obtained by deleting 
the term involving $p_4(t)$. In particular, when attention is restricted to 
the five points themselves, the resultant formula can be expressed in the 
form 

$$y(t) = f(t) - (\tfrac{1}{70}\delta^4 f_0)\,p_4(t) \qquad (7.12.8)$$

in accordance with (7.12.4). Similarly, the first-degree least-squares 
polynomial relevant to five points may be obtained by retaining only 
$p_0(t)$ and $p_1(t)$ in the right-hand member of (7.12.7). 

The methods of this section are readily generalized to cases in which 
more than five points are used in the least-squares calculation. 
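
One way in which the generalization may be sketched is to form, for each tabular point $t$, the weights multiplying the ordinates $f_j$ in $y(t) = \sum_r a_r p_r(t,2M)$. The Python fragment below is illustrative only: the helper names are hypothetical, exact rational arithmetic is assumed, and the Gram polynomials are evaluated from (7.11.6) with the normalization (7.11.9). 

```python
from fractions import Fraction
from math import factorial


def falling(x, k):
    """Factorial power x^(k) = x(x-1)...(x-k+1)."""
    out = Fraction(1)
    for i in range(k):
        out *= x - i
    return out


def gram(r, t, M):
    """p_r(t, 2M) from (7.11.6) with N = 2M, s = t + M, c_rN = (-1)^r."""
    N, s = 2 * M, t + M
    total = sum((-1) ** k * falling(r + k, 2 * k) / factorial(k) ** 2
                * falling(s, k) / falling(N, k) for k in range(r + 1))
    return (-1) ** r * total


def smoothing_weights(M, n):
    """weights[t][j + M] multiplies f_j in the smoothed value y(t)."""
    ts = list(range(-M, M + 1))
    gammas = [sum(gram(r, t, M) ** 2 for t in ts) for r in range(n + 1)]
    return {t: [sum(gram(r, t, M) * gram(r, j, M) / gammas[r]
                    for r in range(n + 1)) for j in ts]
            for t in ts}


# Third degree over five points: the central row is (7.12.2).
print([str(w) for w in smoothing_weights(2, 3)[0]])
```

With `M = 2` and `n = 3` the rows of weights should agree with (7.12.2) and (7.12.3); other choices of `M` and `n` give the corresponding formulas for more points or other degrees. 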

7.13. Smoothing Formulas. In place of approximating $f(t)$ by a single 
least-squares polynomial of degree $n$ over the entire range of an extensive 
tabulation, it is frequently desirable to replace each tabulated value by 
the value taken on by a least-squares polynomial of degree n relevant to 
a subrange of 2M + 1 points centered, if possible, at the point for which 
the entry is to be modified. Thus, except for points near the ends of the 
range of tabulation, each smoothed value is obtained from a distinct 
least-squares polynomial. In this section we list certain sets of smoothing 
formulas which are obtainable for this purpose by the methods of the 
preceding section. 


For first-degree least-squares approximation relevant to three points, the 
formulas are of the form 

$$y_{-1} = \tfrac{1}{6}(5f_{-1} + 2f_0 - f_1) = f_{-1} - \tfrac{1}{6}\delta^2 f_0,$$

$$y_0 = \tfrac{1}{3}(f_{-1} + f_0 + f_1) = f_0 + \tfrac{1}{3}\delta^2 f_0, \qquad (7.13.1)$$

$$y_1 = \tfrac{1}{6}(-f_{-1} + 2f_0 + 5f_1) = f_1 - \tfrac{1}{6}\delta^2 f_0,$$

whereas the formulas relevant to five points are 

$$y_{-2} = \tfrac{1}{5}(3f_{-2} + 2f_{-1} + f_0 - f_2),$$

$$y_{-1} = \tfrac{1}{10}(4f_{-2} + 3f_{-1} + 2f_0 + f_1), \qquad (7.13.2)$$

$$y_0 = \tfrac{1}{5}(f_{-2} + f_{-1} + f_0 + f_1 + f_2),$$

where the omitted formulas, for $y_1$ and $y_2$, are obtained from the formulas 
for $y_{-1}$ and $y_{-2}$ by reversing the numbering of the ordinates. Thus, for 
example, if first-degree five-point least squares were to be used, the central 
formula would be used for all values except the first two and the last two, 
for which the off-center formulas would be used. 

The formulas for third-degree five-point least squares were obtained in 
the preceding section and are listed again, for convenient reference, in the 
forms 

$$y_{-2} = \tfrac{1}{70}(69f_{-2} + 4f_{-1} - 6f_0 + 4f_1 - f_2) = f_{-2} - \tfrac{1}{70}\delta^4 f_0,$$

$$y_{-1} = \tfrac{1}{35}(2f_{-2} + 27f_{-1} + 12f_0 - 8f_1 + 2f_2) = f_{-1} + \tfrac{2}{35}\delta^4 f_0, \qquad (7.13.3)$$

$$y_0 = \tfrac{1}{35}(-3f_{-2} + 12f_{-1} + 17f_0 + 12f_1 - 3f_2) = f_0 - \tfrac{3}{35}\delta^4 f_0,$$

whereas the corresponding seven-point formulas are 

$$y_{-3} = \tfrac{1}{42}(39f_{-3} + 8f_{-2} - 4f_{-1} - 4f_0 + f_1 + 4f_2 - 2f_3),$$

$$y_{-2} = \tfrac{1}{42}(8f_{-3} + 19f_{-2} + 16f_{-1} + 6f_0 - 4f_1 - 7f_2 + 4f_3),$$

$$y_{-1} = \tfrac{1}{42}(-4f_{-3} + 16f_{-2} + 19f_{-1} + 12f_0 + 2f_1 - 4f_2 + f_3), \qquad (7.13.4)$$

$$y_0 = \tfrac{1}{21}(-2f_{-3} + 3f_{-2} + 6f_{-1} + 7f_0 + 6f_1 + 3f_2 - 2f_3).$$

Finally, the fifth-degree seven-point least-squares formulas may be listed 
as follows: 

$$y_{-3} = \tfrac{1}{924}(923f_{-3} + 6f_{-2} - 15f_{-1} + 20f_0 - 15f_1 + 6f_2 - f_3) = f_{-3} - \tfrac{1}{924}\delta^6 f_0,$$

$$y_{-2} = \tfrac{1}{154}(f_{-3} + 148f_{-2} + 15f_{-1} - 20f_0 + 15f_1 - 6f_2 + f_3) = f_{-2} + \tfrac{1}{154}\delta^6 f_0,$$

$$y_{-1} = \tfrac{1}{308}(-5f_{-3} + 30f_{-2} + 233f_{-1} + 100f_0 - 75f_1 + 30f_2 - 5f_3) = f_{-1} - \tfrac{5}{308}\delta^6 f_0, \qquad (7.13.5)$$

$$y_0 = \tfrac{1}{231}(5f_{-3} - 30f_{-2} + 75f_{-1} + 131f_0 + 75f_1 - 30f_2 + 5f_3) = f_0 + \tfrac{5}{231}\delta^6 f_0.$$


The use of an $n$th-degree least-squares polynomial relevant to $2M + 1$ 
points essentially assumes that the true function can be approximated by 
some $n$th-degree polynomial over each subrange of $2M + 1$ points, but 
it admits the possibility that no single $n$th-degree polynomial may be 
satisfactory over the entire range. The amount of smoothing increases 
with the number of points used in the smoothing formula and decreases 
with increasing values of the degree $n$. 

It is often desirable and convenient to employ a smoothing technique 
involving a relatively small number of points, so that the relevant formulas 
are of simple form, and to iterate the process as many times as 
appears to be desirable. The degree $n$ is chosen to be as small as possible, 
in consistency with the assumption that differences of the true 
function, of order higher than $n$, are small. If such a process were 
iterated indefinitely, the sequence of smoothed functions would tend to 
the least-squares polynomial of degree $n$ relevant to the entire range of 
tabulated values. The computer can and generally must rely upon his 
judgment with regard to the stage at which the iteration is to be terminated, 
so that most of the "noise" is eliminated but essential characteristics 
of the function are not appreciably modified. The choice of $n$ is 
often dictated by the fact that the first $n$ differences of the observed function 
$f$ are fairly regular, whereas the $(n + 1)$th differences fluctuate 
erratically and have a mean value near zero. 

As an illustration, we consider the data listed in the second column of 
the following table. 

TABLE 7.2† 

     x    f(x)   5-point   5-point   Spencer   W. and R. 
                  once      twice 
     0    431      402       405                  419 
     1    409      423       422                  422 
     2    429      444       439                  435 
     3    422      459       456                  454 
     4    530      469       472                  473 
     5    505      483       485                  487 
     6    459      504       499                  496 
     7    499      510       516                  508 
     8    526      527       536                  526 
     9    563      554       557                  550 
    10    587      584       585       582        578 
    11    595      612       616       614        610 
    12    647      649       650       648        646 
    13    669      683       684       682        685 
    14    746      720       720       716        724 
    15    760      756       752       749        758 
    16    778      792       784                  787 
    17    828      810       815                  812 
    18    846      841       847                  837 
    19    836      876       880                  868 
    20    916      914       922                  910 
    21    956      960       966                  961 
    22   1014     1019      1012                 1016 
    23   1076     1061      1060                 1069 
    24   1134     1106      1107                 1112 
    25   1124     1152      1154                 1141 

† These data were taken from Spencer [215] and have been analyzed in various ways 
by Spencer, by Whittaker and Robinson [20], and others. 


A plot of the given data suggests that, whereas the true function is 
almost certainly not linear, it can be fairly approximated by a linear 
function over any subrange of, say, three or five points. The smoothed 
data given by the first-degree five-point formulas of (7.13.2) are listed in 
the third column of the table. Each smoothed value except the two 
values at each end of the tabulation is obtained very simply as the average 
of the five values centered at the point considered. Off-center formulas 
are used for those points. A second application of this process 
leads to the values listed in the fourth column and is represented by the 
continuous curve in Fig. 7.2. A quantitative estimate of the degree of 
smoothing is afforded by the fact that the means of the absolute values of 
the second and third differences of the given data are 41 and 75, respectively, 
whereas the corresponding means for the results of the second 
smoothing are 2.1 and 2.6, respectively. At the same time, it appears 
that the characteristic trend of the data is preserved in the smoothing. 

[Fig. 7.2: F(x), on a scale from 500 to 1200, plotted against x from 0 to 25.] 

The results of applying the first-degree three-point formulas of (7.13.1) 
three times are found to be quite similar to the results of using the five-point 
formulas twice, in the present example. 
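
The third column of Table 7.2 can be reproduced by a direct transcription of (7.13.2). The Python sketch below is illustrative only: the function names are hypothetical, at least five ordinates are assumed, and the second pass is applied here to the unrounded results of the first, which may or may not be the procedure followed in preparing the table. 

```python
DATA = [431, 409, 429, 422, 530, 505, 459, 499, 526, 563, 587, 595, 647,
        669, 746, 760, 778, 828, 846, 836, 916, 956, 1014, 1076, 1134, 1124]


def smooth5_linear(f):
    """One pass of the first-degree five-point formulas (7.13.2)."""
    n = len(f)
    # Off-center formulas for the first two and the last two entries.
    head = [(3 * f[0] + 2 * f[1] + f[2] - f[4]) / 5,
            (4 * f[0] + 3 * f[1] + 2 * f[2] + f[3]) / 10]
    tail = [(4 * f[-1] + 3 * f[-2] + 2 * f[-3] + f[-4]) / 10,
            (3 * f[-1] + 2 * f[-2] + f[-3] - f[-5]) / 5]
    # Central formula: the average of the five values centered at the point.
    middle = [sum(f[i - 2:i + 3]) / 5 for i in range(2, n - 2)]
    return head + middle + tail


def mean_abs_difference(values, order):
    """Mean absolute value of the differences of the given order."""
    d = list(values)
    for _ in range(order):
        d = [b - a for a, b in zip(d, d[1:])]
    return sum(abs(v) for v in d) / len(d)


once = smooth5_linear(DATA)
twice = smooth5_linear(once)
print([round(v) for v in once])
print(mean_abs_difference(DATA, 2), mean_abs_difference(twice, 2))
```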

In the fifth column of the table are listed the results obtained by 
Spencer, by use of an elaborate 21-point formula which yields smoothed 
values only at points which are more than 10 intervals away from the 
ends. 

The sixth column of the table lists results obtained by Whittaker and 
Robinson, by use of another elaborate 21-point formula combined with 
an appreciable amount of auxiliary calculation relevant to the smoothing 
of the first and last 10 entries. Whereas the smoothed values generally 
do not differ appreciably from those obtained (much more simply) in the 
fourth column, the advantage in smoothness actually belongs to the 
results of the simpler method, in the sense that the mean absolute second 
and third differences relevant to the data of the sixth column are found to 
be 5.2 and 3.4, respectively, as compared with 2.1 and 2.6 for the data 
of the fourth column. 

It should be emphasized, however, that the smallness of certain mean 
absolute differences cannot in itself be taken as an indication of a satisfactory 
smoothing. By repeating the smoothing which led to column 
three indefinitely often, we would eventually be led to a "smoothed 
curve" which is represented by a straight line over the entire range, and 
hence for which all differences of order greater than one would vanish. 
This linear approximation would be obtained directly by use of first-degree 
26-point formulas. 

It is, of course, conceivable that the deviation from linearity of the 
smoothed curve is still predominantly "noise" and that a much more 
drastic smoothing is indeed called for. It is at this point that the judgment 
of the computer (or the weight of additional evidence) must be 
brought into play. 

As a further example, a plot of the data 

    x        0     1     2     3     4     5     6     7     8 
    f(x)    54   145   227   359   401   342   259   112    65 

(see Fig. 7.3) suggests that the true function can be approximated by a 
third-degree polynomial over each subrange of five points. The use of 
the formulas of (7.13.3) yields the smoothed values 

    x        0     1     2     3     4     5     6     7     8 
    y(x)    57   134   244   348   393   352   242   124    62 

which are plotted and joined by a continuous curve in Fig. 7.3. 
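
These smoothed values can be checked by applying the fourth-difference forms of (7.13.3) to each five-point subrange. The Python sketch below is illustrative only: the function name is hypothetical, at least five ordinates are assumed, and the formulas for the last two points are taken by symmetry from those for the first two. 

```python
F = [54, 145, 227, 359, 401, 342, 259, 112, 65]

# Multipliers of the fourth central difference in (7.13.3), indexed by the
# offset of the point from the center of its five-point subrange.
FACTOR = {-2: -1 / 70, -1: 2 / 35, 0: -3 / 35, 1: 2 / 35, 2: -1 / 70}


def smooth5_cubic(f):
    """Third-degree five-point smoothing of every entry of f."""
    n = len(f)
    out = []
    for i in range(n):
        c = min(max(i, 2), n - 3)        # center of the subrange used
        d4 = f[c - 2] - 4 * f[c - 1] + 6 * f[c] - 4 * f[c + 1] + f[c + 2]
        out.append(f[i] + FACTOR[i - c] * d4)
    return out


print([round(v) for v in smooth5_cubic(F)])
```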

Whereas it is possible to determine a set of orthogonal polynomials 
over a discrete range relative to a specified weighting function $w$ (an 
important special case is treated in §9.5) and to derive corresponding 
smoothing formulas, a more convenient procedure which tends to accomplish 
about the same purpose, when $w$ does not vary excessively, consists 
in applying the preceding smoothing formulas to the product $wf$ and then 
dividing the result by $w$. More generally, the function $f$ may be first 
transformed in an appropriate way to a new function $g$, and the new 
function $g$ may be smoothed, after which the inverse transformation may 
be applied to the smoothed function. In particular, in the case of the 
last preceding example, the graph of the function $f(x)$ (Fig. 7.3) indicates 
a resemblance to a function of the form $\exp[-(Ax^2 + Bx + C)]$ and suggests 
that the smoothing be applied to $\log f(x)$, rather than to $f(x)$ itself. 

Finally, it may be pointed out that the central smoothing formulas can 
be obtained rather simply without explicitly determining the least-squares 
polynomials involved. In this connection, we notice that the 
orthogonal polynomials $p_r(t,2M)$ defined in §7.11 vanish at $t = 0$ when 
$r$ is odd. Hence, if $n$ is even, the central smoothing formula corresponding 
to a least-squares polynomial of degree $n$ will be identical with the formula 
corresponding to that of degree $n + 1$. 


For $n = 0$ or $n = 1$, there follows merely $y_0 = a_0$ and hence, since 
$p_0(t,2M) = 1$, 

$$y_0 = \frac{1}{2M+1}\sum_{r=-M}^{M} f_r. \qquad (7.13.6)$$

Thus, as in the special cases of (7.13.1) and (7.13.2), each smoothed value 
of $f_0$ is the average of the $2M + 1$ values centered about $f_0$. 
For $n = 2$ or $n = 3$, reference to (7.11.10) gives 

$$y_0 = a_0 + a_2\,p_2(0,2M) = a_0 - \frac{M+1}{2M-1}\,a_2, \qquad (7.13.7)$$

where 

$$a_0 = \frac{1}{2M+1}\sum_{r=-M}^{M} f_r \qquad (7.13.8)$$

and 

$$a_2 = \frac{1}{\gamma_2\,M(2M-1)}\sum_{r=-M}^{M}[3r^2 - M(M+1)]f_r, \qquad (7.13.9)$$

with 

$$\gamma_2 = \sum_{r=-M}^{M} p_2^2(r,2M). \qquad (7.13.10)$$

The calculation of $\gamma_2$ can be effected by the summation technique outlined 
in §7.9 [following (7.9.11)], in the form 

$$\gamma_2 = \frac{(2M+1)(2M+2)(2M+3)}{10M(2M-1)}, \qquad (7.13.11)$$

and the insertion of (7.13.8), (7.13.9), and (7.13.11) into (7.13.7) leads 
immediately to the required formula 

$$y_0 = \frac{3}{(2M-1)(2M+1)(2M+3)}\sum_{r=-M}^{M}[(3M^2 + 3M - 1) - 5r^2]f_r, \qquad (7.13.12)$$

which specializes to the central formulas of (7.13.3) and (7.13.4) when 
$M = 2$ and $M = 3$, respectively. 

A similar analysis leads to the central smoothing formula 

$$y_0 = \frac{15}{4(4M^2-1)(4M^2-9)(2M+5)}\sum_{r=-M}^{M}[(15M^4 + 30M^3 - 35M^2 - 50M + 12) - 35(2M^2 + 2M - 3)r^2 + 63r^4]f_r, \qquad (7.13.13)$$

relevant to fourth- or fifth-degree least-squares approximation using $2M + 1$ 
points, which specializes to the central formula of (7.13.5) when $M = 3$.† 
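
The specializations just mentioned are easily verified. The following Python sketch simply evaluates the weights of (7.13.12) and (7.13.13) in exact rational arithmetic; the function names are hypothetical, and the second formula presupposes $M \ge 2$ so that the factor $4M^2 - 9$ does not vanish. 

```python
from fractions import Fraction


def central_weights_cubic(M):
    """Weights of f_r, r = -M..M, in (7.13.12) (degree two or three)."""
    scale = Fraction(3, (2 * M - 1) * (2 * M + 1) * (2 * M + 3))
    return [scale * ((3 * M * M + 3 * M - 1) - 5 * r * r)
            for r in range(-M, M + 1)]


def central_weights_quintic(M):
    """Weights of f_r, r = -M..M, in (7.13.13) (degree four or five)."""
    scale = Fraction(15, 4 * (4 * M * M - 1) * (4 * M * M - 9) * (2 * M + 5))
    const = 15 * M ** 4 + 30 * M ** 3 - 35 * M ** 2 - 50 * M + 12
    return [scale * (const - 35 * (2 * M * M + 2 * M - 3) * r * r + 63 * r ** 4)
            for r in range(-M, M + 1)]


print([str(w) for w in central_weights_cubic(2)])     # compare (7.13.3)
print([str(w) for w in central_weights_cubic(3)])     # compare (7.13.4)
print([str(w) for w in central_weights_quintic(3)])   # compare (7.13.5)
```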

As was pointed out earlier, the central smoothing formulas alone are 
generally useful only for smoothing values at points at least M intervals 
distant from the ends of the range of tabulation. However, they can be 
used throughout the entire range in the special cases when the true func- 
tion is known to vanish outside the range of tabulation and to tend to 
zero smoothly as the ends of the range are approached from the interior, 
so that the zero values at exterior points can be used in smoothing values 
at interior points near the ends. 

7.14. Supplementary References. For comprehensive treatments of 
orthogonal polynomials, see Jackson [124] and Szego [220]. See also the 
bibliography of Shohat, Hille, and Walsh [212]. Least-squares tech- 
niques, and associated error analyses, are considered by Aitken [24], 
Birge and Weinberg [42], Lewis [133], Guest [99, 100], and Hayes and 
Vickers [108]. See also texts such as Fisher [78], Scarborough [17], and 
Whittaker and Robinson [20]. The orthogonal (Gram) polynomials rel- 
ative to a unit weighting function over a discrete set of points are tab- 
ulated by Anderson and Houseman [243] and by De Lury [248]. For 
additional smoothing techniques, see Spencer [215], Rhodes [194], Whit- 
taker and Robinson [20], Wolfenden [239], Sard [206], Doodson [70], 
Lanczos [131], Schoenberg [209], and Wiener [236]. 


PROBLEMS 
Section 7.2 


1. Show that the functions $\varphi_0(x) = 1$ and $\varphi_1(x) = x$ are orthogonal under integration 
over $(-1,1)$, and obtain the linear least-squares approximation $y_1(x)$ to a given 
function $f(x)$ over $(-1,1)$, 

$$f(x) \approx y_1(x) = a_0 + a_1x \qquad (-1 \le x \le 1),$$

for which 

$$\int_{-1}^{1}(f - y_1)^2\,dx = \min,$$

in the form 

$$y_1(x) = \tfrac{1}{2}\int_{-1}^{1}(1 + 3xt)f(t)\,dt.$$

Show also that the corresponding RMS error in $(-1,1)$ is given by 

$$\Big[\tfrac{1}{2}\int_{-1}^{1} f^2\,dx - a_0^2 - \tfrac{1}{3}a_1^2\Big]^{1/2}.$$

† The central formulas (7.13.12) and (7.13.13) are written out explicitly for Mg. 
in Whittaker and Robinson [20]. 

2. Show that the functions $\varphi_0(x) = 1$ and $\varphi_1(x) = x$ are orthogonal under summation 
over the abscissas $x_0 = -1$, $x_1 = 0$, and $x_2 = 1$, and obtain the linear least-squares 
approximation $y_2(x)$ to a given function $f(x)$ over $(-1,1)$, 

$$f(x) \approx y_2(x) = A_0 + A_1x \qquad (-1 \le x \le 1),$$

for which 

$$\sum_{k=0}^{2}[f(x_k) - y_2(x_k)]^2 = \min,$$

in the form 

$$y_2(x) = \tfrac{1}{6}[(2 - 3x)f(-1) + 2f(0) + (2 + 3x)f(1)].$$

Show also that the corresponding RMS error over the three relevant points is given by 

$$\Big[\tfrac{1}{3}\sum_{k=0}^{2}[f(x_k)]^2 - A_0^2 - \tfrac{2}{3}A_1^2\Big]^{1/2}.$$

3. If $y_1(x)$ is the linear approximation to $f(x)$ obtained in Prob. 1, and if $f''(x)$ is 
continuous in $(-1,1)$, show that 

$$f(x) - y_1(x) = \int_{-1}^{1} g_1(x,s)f''(s)\,ds,$$

where 

$$g_1(x,s) = (x - s)_+ - \tfrac{1}{2}\int_{-1}^{1}(t - s)_+(1 + 3xt)\,dt = \begin{cases} -\tfrac{1}{4}(1 + s)^2(1 - 2x + sx) & (s \le x), \\ -\tfrac{1}{4}(1 - s)^2(1 + 2x + sx) & (s \ge x). \end{cases}$$

Show also that $g_1(x,s)$ is of constant sign for $x$ and $s$ in $(-1,1)$ if and only if $|x| \le \tfrac{1}{3}$ or 
$|x| = 1$, and establish the relation 

$$f(x) = y_1(x) - \tfrac{1}{6}(1 - 3x^2)f''(\xi) \qquad (|x| \le \tfrac{1}{3} \text{ or } |x| = 1),$$

where $-1 < \xi < 1$, showing, in particular, that 

$$f(-1) - y_1(-1) = \tfrac{1}{3}f''(\xi_1), \qquad f(0) - y_1(0) = -\tfrac{1}{6}f''(\xi_2), \qquad f(1) - y_1(1) = \tfrac{1}{3}f''(\xi_3).$$

4. If $y_2(x)$ is the linear approximation to $f(x)$ obtained in Prob. 2, and if $f''(x)$ is continuous 
in $(-1,1)$, show that 

$$f(x) - y_2(x) = \int_{-1}^{1} g_2(x,s)f''(s)\,ds,$$

where 

$$g_2(x,s) = (x - s)_+ - \tfrac{1}{3}(-s)_+ - \tfrac{1}{6}(1 - s)(2 + 3x).$$

Show also that $g_2(x,s)$ is of constant sign for $x$ and $s$ in $(-1,1)$ if and only if $|x| \le \tfrac{2}{3}$ or 
$|x| = 1$, and establish the relation 

$$f(x) - y_2(x) = -\tfrac{1}{6}(2 - 3x^2)f''(\xi) \qquad (|x| \le \tfrac{2}{3} \text{ or } |x| = 1),$$

where $-1 < \xi < 1$. 

5. If $y_3(x)$ is the second-degree polynomial which agrees exactly with $f(x)$ when 
$x = -1$, 0, and 1, and if $y_2(x)$ is the linear approximation of Prob. 2, show that 

$$y_2(x) = y_3(x) + \tfrac{1}{6}(2 - 3x^2)\,\delta^2 f(0),$$

where $\delta$ is the central-difference operator with unit spacing. In particular, show that 

$$f(-1) - y_2(-1) = -\tfrac{1}{2}[f(0) - y_2(0)] = f(1) - y_2(1) = \tfrac{1}{6}\delta^2 f(0),$$

that the RMS error over the three points $x = -1$, 0, and 1 is 

$$\frac{1}{3\sqrt{2}}\,|\delta^2 f(0)|,$$

and that 

$$y_2(x) = f(x) + \tfrac{1}{6}(2 - 3x^2)f''(x),$$

for all values of $x$, if $f(x)$ is a polynomial of degree two or less. 


Section 7.3 

6. If the right-hand member of the $r$th normal equation associated with (7.3.4) is 
denoted by $v_r$ $(r = 0, 1, \ldots, n)$, show that the weighted sum of the squares of the 
$N + 1$ residuals is given by 

$$\sum_{i=0}^{N} w(x_i)[f(x_i)]^2 - \sum_{r=0}^{n} a_r v_r,$$

and use this relation to calculate that sum for the numerical example of §7.3. 
7. Suppose that the following empirical data are available: 

    x       1.36     1.49     1.73     1.81     1.95     2.16     2.28     2.48 
    F(x)   14.094   15.069   16.844   17.378   18.435   19.949   20.963   22.495 

Determine least-squares polynomial approximations $y_1(x)$ and $y_2(x)$ of degrees one and 
two, respectively, weighting all data equally, and calculate the RMS value of the eight 
residuals in both cases. 

8. Obtain estimated values of the RMS deviation between the unknown true function 
$f(x)$ and the observed function $F(x)$ in Prob. 7, based on the approximations $y_1(x)$ 
and $y_2(x)$, and also determine the approximate RMS errors in the calculated coefficients 
involved in those approximations. Would either (or both) of the approximations 
be acceptable if it were known, independently, that the RMS value of the 
observational errors is about 0.04? 
Section 7.4 
9. With the notation of §7.4, show that 

$$a_r = \frac{\int_a^b U_r(x)f^{(r)}(x)\,dx}{r!\,A_r\int_a^b U_r(x)\,dx}$$

if $f^{(r)} \equiv d^r f/dx^r$ exists everywhere in $(a,b)$ and is continuous. 

10. If $w(x) = (x - a)^{\alpha}(b - x)^{\beta}$, verify that 

$$U_r(x) = C_r(x - a)^{r+\alpha}(b - x)^{r+\beta}$$

satisfies the conditions of §7.4, when $C_r$ is a constant, if $\alpha > -1$ and $\beta > -1$. 

11. If $w(x) = x$ and $(a,b) = (0,1)$, show that the $r$th orthogonal polynomial is 
given by 

$$\varphi_r(x) = C_r\,\frac{1}{x}\,\frac{d^r}{dx^r}[x^{r+1}(1 - x)^r]$$

and that the arbitrary normalization $\varphi_r(0) = 1$ requires that 

$$C_r = \frac{1}{(r+1)!}.$$

Determine the polynomials of degrees zero through four, and prove that 

$$\int_0^1 x\,\varphi_r^2(x)\,dx = \frac{1}{2(r+1)^3}$$

in consequence of the relation 

$$\int_0^1 x^{p-1}(1 - x)^{q-1}\,dx = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}.$$

12. Use the results of Prob. 11 to show that the $n$th-degree least-squares polynomial 
approximation to $f(x)$ over $(0,1)$, relevant to the weighting function $w(x) = x$, is 
defined by 

$$y(x) = \sum_{r=0}^{n} a_r\varphi_r(x),$$

where 

$$a_r = 2(r+1)^3\int_0^1 x f(x)\varphi_r(x)\,dx,$$

and where $\varphi_r(x)$ is defined in Prob. 11. In particular, show that the linear approximation 
is of the form 

$$y(x) = 6\int_0^1[(3t - 4t^2) - 2(2t - 3t^2)x]f(t)\,dt.$$
Section 7.5 


13. By expanding $(x^2 - 1)^r$ in descending powers of $x^2$, and appropriately differentiating 
term by term, show that (7.5.6) implies the relation 

$$P_r(x) = \sum_{k}(-1)^k\frac{(2r - 2k)!}{2^r\,k!\,(r-k)!\,(r-2k)!}\,x^{r-2k},$$

where the series terminates when $k = r/2$ if $r$ is even and when $k = (r-1)/2$ if $r$ is 
odd. 

14. Show that the coefficient of $P_r(x)$ in (7.5.11) can be expressed in the form 

$$a_r = \frac{2r+1}{2^{r+1}r!}\int_{-1}^{1}(1 - x^2)^r f^{(r)}(x)\,dx$$

if $f^{(r)}(x)$ exists everywhere in $(-1,1)$ and is continuous. 

15. Show that the leading terms in the Legendre expansion of $f(x) = \cos(\pi x/2)$ 
over $(-1,1)$ are of the form 

$$\cos\frac{\pi x}{2} = \frac{2}{\pi}P_0(x) - \frac{10}{\pi^3}(12 - \pi^2)P_2(x) + \frac{18}{\pi^5}(\pi^4 - 180\pi^2 + 1680)P_4(x) - \cdots.$$

16. Compare numerically the approximations to $f(x) = \cos(\pi x/2)$ afforded by the 
least-squares polynomials of degrees two and four, obtained in Prob. 15, with the 
approximating polynomials of corresponding degrees afforded by truncated power 
series and by fitting $f(x)$ exactly at three and at five equally spaced points in $(-1,1)$. 
17. If $f(x) = [(x + 1)/2]^{1/2}$, show that the coefficient of $P_r(x)$ in the Legendre expansion 
of $f(x)$ in $(-1,1)$ is given by 

$$a_r = (-1)^{r+1}\frac{2}{(2r-1)(2r+3)},$$

so that 

$$\Big(\frac{x+1}{2}\Big)^{1/2} = \frac{2}{3}P_0(x) + \frac{2}{1\cdot 5}P_1(x) - \frac{2}{3\cdot 7}P_2(x) + \frac{2}{5\cdot 9}P_3(x) - \cdots \qquad (|x| < 1).$$


18. Assuming the results of Prob. 17, compare the least-squares polynomial approximations 
to $f(x) = [(x + 1)/2]^{1/2}$ of degrees two and four over $(-1,1)$ with the corresponding 
results of truncating power series and with the polynomials of degrees two 
and four which agree with $f(x)$ at three and at five equally spaced points in $(-1,1)$. 

19. Obtain the expansion 

$$|x| = \tfrac{1}{2}P_0(x) + \tfrac{5}{8}P_2(x) - \tfrac{3}{16}P_4(x) + \tfrac{13}{128}P_6(x) - \cdots \qquad (|x| < 1),$$

and compare the approximations of degrees two and four with the corresponding 
polynomial approximations which agree exactly with $f(x)$ at three and at five equally 
spaced points in $(-1,1)$. 


Section 7.6 
20. By using Leibnitz’ formula (3.3.11), show that (7.6.8) implies the relation 


k=0 Ho 


21. Show that (7.6.15) can be expressed in the form 


a, = ae a xref (x) dx 


if f(a) exists for all ¢ 2 0 and is continuous, and if f(x) and its first r derivatives 
dominated by x~’~e%* as x—> «. Show also that 


ar = a ee (a > c) 
when f(z) = e°*, and that 
(-)1rs + DP 


* ahraore  uk 





when f(r) = 2°. ; ; 
22. If f(x) = [1 — (#/N)|* when 0 Sz s N and f(x) = O when « 2 N, obtain 
leading terms of the expansion (7.6.16), with a = 1, in the form 


2 N 


Noon At rad ate 
§@) = [1+ eh - aT + |... 





23. Show that the requirement that the best linear approximation to $e^{\alpha x}f(x)$
in $(0,\infty)$ be a constant, in the sense of (7.6.17), determines $\alpha$ in the form

$$\alpha = \frac{\int_0^\infty f(x)\,dx}{\int_0^\infty x f(x)\,dx}.$$

In particular, show that the most appropriate choice of $\alpha$ for Prob. 22 (in
this sense) is $(N + 2)/N$.


Section 7.7 
24. Obtain from (7.7.6) the relations

$$\frac{d}{dx}H_r(x) = 2rH_{r-1}(x), \qquad \frac{d}{dx}\left[e^{-x^2}H_r(x)\right] = -e^{-x^2}H_{r+1}(x),$$

and deduce also that $H_r(x)$ satisfies the differential equation

$$H_r'' - 2xH_r' + 2rH_r = 0.$$
25. Use the first relation of Prob. 24, with the relation $H_0(x) = 1$, to show that
the coefficient of $x^r$ in $H_r(x)$ is $2^r$. Also, by writing

$$H_r(x) = \sum_{k=0} a_k x^{r-2k}$$

in the differential equation of Prob. 24, show that

$$a_{k+1} = -\frac{(r - 2k)(r - 2k - 1)}{4(k + 1)}\,a_k,$$

and deduce that

$$H_r(x) = (2x)^r - \frac{r(r - 1)}{1!}(2x)^{r-2} + \frac{r(r - 1)(r - 2)(r - 3)}{2!}(2x)^{r-4} - \cdots,$$

where the series terminates with a multiple of $x$ when $r$ is odd and with a constant
when $r$ is even.
26. Show that (7.7.13) can be written in the form 


alr | © 
a, = —— e @22F(r) (7) dx 
Qrr! Vr mene f ) 


if f(a) exists and is continuous for all z, and if f(x) and its first r derivatives are 
dominated by 2~e%*"* as a— + 0, 
27. By taking $f(x) = e^{2x}$ and $\alpha = 1$ in the result of Prob. 26, obtain the
expansion

$$e^{2x} = e\sum_{k=0}^{\infty}\frac{H_k(x)}{k!}.$$

28. If $f(x) = 1 - |x|$ when $|x| \le 1$ and $f(x) = 0$ when $|x| \ge 1$, obtain the
expansion


Sia) = 7 2 oat | Hola) a] t=? Hylax) + 45 — 30a* + dat Hilaz) ++: ‘| 
and show that, if $\alpha$ is chosen such that the coefficient of $H_2$ vanishes, there follows


f(a) = Bet HoV52) — rhoHV32) +] 


29. If the origin is chosen such that $m_1 = 0$, and if $\alpha^2$ is then taken to be
$m_0/(2m_2)$, with the notation of (7.7.17), show that the expansion (7.7.14) becomes


a(2a?2ms — 1 4a’m, — 12a? 
f(x) = Saee| me + aQotms — 3m) Hy(ar) + “Aa'm — Mains + 8) F(a) 
bias | 





Section 7.8 
30. Obtain the expansion 





$$|x| = \frac{4}{\pi}\left[\frac{1}{2} + \frac{1}{3}T_2(x) - \cdots + (-1)^{k+1}\frac{1}{4k^2 - 1}T_{2k}(x) + \cdots\right]$$

when $|x| < 1$, and compare the approximations of degrees two and four with those
obtained in Prob. 19.

31. Show that the function $S_r(x)$ defined by the relation

$$S_r(x) = \frac{1}{r + 1}T_{r+1}'(x)$$

is a polynomial of degree $r$, expressible in the form

$$S_r(x) = \frac{\sin[(r + 1)\cos^{-1}x]}{\sqrt{1 - x^2}} = \frac{\sin(r + 1)\theta}{\sin\theta} \qquad (\theta = \cos^{-1}x),$$

that the polynomials $S_0(x), S_1(x), \ldots, S_r(x), \ldots$ are orthogonal over $(-1,1)$
relative to the weighting function $w(x) = \sqrt{1 - x^2}$, that $A_r = 2^r$ and
$\gamma_r = \pi/2$, and that the coefficients in the approximation

$$f(x) \approx y(x) = \sum_{r=0}^{n} a_r S_r(x) \qquad (|x| < 1)$$

are given by

$$a_r = \frac{2}{\pi}\int_{-1}^{1}\sqrt{1 - x^2}\,f(x)S_r(x)\,dx$$

when the requirement

$$\int_{-1}^{1}\sqrt{1 - x^2}\,[f(x) - y(x)]^2\,dx = \min$$

is imposed.

32. Using the notation of Prob. 31, obtain the expansion

$$|x| = \frac{4}{\pi}\left[\frac{1}{3}S_0(x) + \frac{1}{5}S_2(x) - \cdots + (-1)^{k+1}\frac{1}{(2k - 1)(2k + 3)}S_{2k}(x) + \cdots\right],$$

and compare the approximations of degrees two and four with those considered in
Prob. 30.

Section 7.9

33. Express the following sums in closed forms:

(a) $1\cdot 2 + 2\cdot 3 + \cdots + n(n + 1) = \sum_{s=2}^{n+1} s^{(2)}$,

(b) $1\cdot 2\cdot 3 + 2\cdot 3\cdot 4 + \cdots + n(n + 1)(n + 2)$,

(c) $1\cdot 2 + 4\cdot 5 + 7\cdot 8 + \cdots + (3n - 2)(3n - 1)$.
34. Express the following sums in closed forms, and determine the limit of each 
asn— o: 





n—-1 
1 1 1 _ 7 
(@) ratgat:: taaep= >) * m3 
1 1 1 
By Ve ed eg 
Ora) aaa? °° Pea ee 
(c) I + 3 + 5 Boy cinta Cee Mg A 
1-2-3 2-3-4 3:°4-5 n(n + 1)(n + 2) 


35. Show that 
$$(-m)^{(n)} = (-1)^n (m + n - 1)^{(n)}$$


when n is a positive integer. 


36. Show that 
n Pp n n + Pp . 


37. If m is a positive integer, show that


OR en ee) 


and hence that 


and deduce the relation 


siMagin) = De: ee on™. 
m a 


N 


> 8 Aju, = [8 Arts — Ueqil¥e? 
s=M 


38. Show that


and use this formula to obtain the result

$$\sum_{s=0}^{N} s\,a^s = \frac{1}{(a - 1)^2}\left[Na^{N+2} - (N + 1)a^{N+1} + a\right] \qquad (a \ne 1),$$

by taking $u_s = a^s/(a - 1)$ or otherwise.

39. Show that


N 
Us Aivs = [us Ay us = (Ars) (At *0541) + (Ajus) ( 1 °Us+2) ey Pas 
s=M 
N 
+ (DAP Mare all? + (=D feadoese 
s=M 
and also that 
N 
Us Ais = [Us—1 AT 10s — (Arts—2) (AT~205) + (Ajus—s) (Avs) — + + - 
s=M 
N 
+ (=D ded + (=D Yh 
s=M 


Section 7.10 
40. Show that (7.10.13) can be written in the form 


N 
ye(N) = (—1)rr1 A, > U.(s +1, N), 
s=0 


where 4A, is the coefficient of s" in ¢;(s,N). 
41. Prove that, if w(s) = 0 for s = 0, 1, ..., N, then the polynomial ¢,(s,N) 
possesses r real zeros in the interval 0 < s < N. 


Section 7.11 


42. With the notation of Prob. 40, show that, when (7.11.9) is imposed on (7.11.6), 
there follows 
N 


N 
vi) = for(i = TS) @ +n +r -N - 0. 
s=0 s=0 


By making use of appropriate summations by parts, show further that 


N 
1 
y(N) = [NOP >, (s + r)@r 
s=0 
and deduce the closed form 


_ 1 N4+r41yern 
yr(N) = or + 1 Vo 4 





43. Use the results of Prob. 42 and of Eqs. (7.11.10) to express the leading terms of 
(7.11.12) in the form 





M M 
ee 3 
vO = om + id + Ml + DQM +1) 2, we 


M 
15t? — 5M(M + 1) 


+ MM + 1)@M = 1) QM + 1)QM +3) | Ne LON G.ne SA se SOU le 
Section 7.12


44. Prepare a table analogous to Table 7.1 in the case M = 3, when seven points 
are employed in the least-squares approximation, including only the orthogonal poly- 
nomials of degrees five and less. 

45. By using the table prepared in Prob. 44, obtain least-squares polynomial 
approximations of degrees one through five to f(t) =F(2zx — 8) for |é| S$ 3 from the 
following approximate data, calculate the respective smoothed values at the tabular 
points, and determine which approximation is probably most appropriate if the data 
are empirical, with errors having an estimated RMS value of about 0.07: 


x    |  0.0     0.5     1.0     1.5     2.0     2.5     3.0
F(x) | 15.564  18.059  20.548  23.554  26.348  29.498  32.830





46. Use the results of Prob. 45 to obtain approximate values of the following 
quantities from the smoothed data: 


3 2.3 
F(0.1), F(1.8), F’(1.0), F’(1.3), i F(a) dz, i q F(a) dz. 


47. Obtain a formula analogous to (7.12.8) for fifth-degree seven-point least-squares 
approximation. 


Section 7.13 


48. Use (7.13.4) and (7.13.5) to obtain smoothed values of the data given in Prob. 
45, corresponding to third-degree and fifth-degree least squares, and verify that the 
results agree with those obtained in Prob. 45. 

49. The following data represent estimated world route mileages of scheduled air
services in the years given, in units of 1000 miles. Calculate smoothed values, using
both first- and third-degree five-point formulas, and plot the two smoothed curves
together with points representing the given data.


1919 3.2 1926 48.5 1933 200.3
1920 9.7 1927 54.7 1934 223.1
1921 12.4 1928 90.7 1935 278.2
1922 16.0 1929 125.8 1936 305.2
1923 16.1 1930 156.8 1937 333.5
1924 20.3 1931 185.1 1938 349.1
1925 34.0 1932 190.2


Also use the two sets of smoothed data to obtain estimates of the annual rate of
increase of mileage, at the end of the tabulation, to be used for long-range and short-
range predictions.





CHAPTER 8 


GAUSSIAN QUADRATURE AND RELATED TOPICS 


8.1. Introduction. The formulas given in Chaps. 3 and 5, for the pur- 
pose of numerical integration (with or without differences), each involve 
sets of ordinates which correspond to equally spaced abscissas. As might 
be expected, corresponding formulas which are generally capable of sup- 
plying comparable accuracy with fewer (about half as many) ordinates 
can be obtained by determining the optimal distribution of the abscissas, 
rather than prescribing them in an arbitrary way. It is found that the 
abscissas so determined are generally specified by irrational numbers and 
that the same is usually true of the weights by which the corresponding 
ordinates are to be multiplied. 

As a specific, but typical, example, which may be helpful in motivating
some remarks with regard to such formulas, the five-point Newton-Cotes
formula (3.5.13), of closed type, is of the form

$$\int_{-1}^{1} f(x)\,dx = \frac{1}{45}\left[7f(-1) + 32f(-\tfrac{1}{2}) + 12f(0) + 32f(\tfrac{1}{2}) + 7f(1)\right] - \frac{f^{vi}(\xi)}{15120} \tag{8.1.1}$$

when related to the interval $(-1,1)$, whereas the Legendre-Gauss three-
point formula, to be derived in §8.5, is of the form

$$\int_{-1}^{1} f(x)\,dx = \frac{1}{9}\left[5f\left(-\frac{\sqrt{15}}{5}\right) + 8f(0) + 5f\left(\frac{\sqrt{15}}{5}\right)\right] + \frac{f^{vi}(\eta)}{15750}, \tag{8.1.2}$$

where $\xi$ and $\eta$ both lie somewhere in $(-1,1)$.

A comparison of the two error terms shows that the second formula,
which requires the values of only three ordinates, may generally be
expected to afford about the same accuracy as the first, which requires
five ordinates, when the error terms are neglected. Also, since the
weights are positive in both formulas, the error in the result, due to
possible errors in the ordinates, cannot exceed (but may equal) twice the
maximum of those errors in both cases. Moreover, if random errors in
the ordinates are considered, the corresponding RMS errors in the
approximations afforded by the first and second formulas are found to be given
by about 0.48 and about 0.68 times the RMS ordinate error, respectively.
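The comparison of the two formulas can be tried numerically. The following is a minimal sketch, not part of the original text: it applies (8.1.1) and (8.1.2), with the error terms dropped, to the test integrand $f(x) = e^x$, which is chosen here purely for illustration. The function names are hypothetical.

```python
import math

def newton_cotes_5(f):
    """Five-point closed Newton-Cotes rule (8.1.1) on (-1, 1), error term dropped."""
    return (7*f(-1.0) + 32*f(-0.5) + 12*f(0.0) + 32*f(0.5) + 7*f(1.0)) / 45.0

def gauss_legendre_3(f):
    """Three-point Legendre-Gauss rule (8.1.2) on (-1, 1), error term dropped."""
    t = math.sqrt(15.0) / 5.0
    return (5*f(-t) + 8*f(0.0) + 5*f(t)) / 9.0

# Assumed test case: the integral of e^x over (-1, 1) is e - 1/e.
exact = math.e - 1.0 / math.e
err_nc = newton_cotes_5(math.exp) - exact     # approximation minus true value
err_gauss = gauss_legendre_3(math.exp) - exact

print("five-point Newton-Cotes error:", err_nc)
print("three-point Gauss error:      ", err_gauss)
```

For this integrand the two errors are of comparable magnitude and of opposite sign, as the two error terms suggest, although the second rule uses three ordinates rather than five.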

Thus the apparent advantage of the second formula consists in the 
fact that, aside from the central ordinate, which is needed in both, it 
involves only half as many ordinates as the first. However, unless f(x) 
is a polynomial (in which case the formulas are not needed) the required 
ordinates are generally to be obtained by reference to a table of values of 
f(x). It is then often argued that, since two of the abscissas in (8.1.2) 
are irrational, interpolation involving at least two tabulated ordinates will 
be required for the determination of each of the two off-center ordinates, 
so that at least five ordinates will truly be involved in the use of (8.1.2). 
Thus, the apparent advantage is lost, and even reversed, since (8.1.1) 
involves the five ordinates needed in a simple and specific form. 

For this reason, and also because of the fact that the weights in most 
Gaussian formulas are also irrational (the present case is an exception), 
so that, in place of multiplying each ordinate by an integer, one must 
multiply it by a number with at least as many significant digits as are 
required in the final result, relatively little practical use has been made of 
such formulas. 

This situation is indeed unfortunate, since the second reason given, 
while an important one when calculations are necessarily effected by hand, 
slide rule, or use of tables of logarithms, is clearly of no significance when 
a computing device with even the relatively limited efficiency of a mod- 
ern desk calculator is available, and the argument supplying the first 
reason is (rather obviously) generally fallacious. Specifically, it assumes 
that the ordinates denoted as f(-1), f(-1/2), and so forth, are known or
can be found directly in tables, without the need of interpolation, and is 
valid only then. 

It is true that available tables of many functions, such as $e^{-x^2}$ and $J_0(x)$,
for example, include these arguments, and these are typical of the func-
tions which most frequently appear in textbooks dealing with numerical
integration. But practical problems tend to deal instead with functions
such as $e^{-\alpha x^2/L^2}$ and $J_0(\alpha x/L)$ over the interval $(-L,L)$ and,
correspondingly, with $e^{-\alpha x^2}$ and $J_0(\alpha x)$ over the normalized interval
$(-1,1)$, where $\alpha$ is a function of certain physical quantities and is most
unlikely to have an integral (or rational) value. Thus, in practical situations,
it is probable that each of the ordinates appearing in either of the forms (8.1.1)
and (8.1.2) will have to be determined by interpolation (or by direct
calculation), and the interpolation for $J_0(\alpha\sqrt{15}/5)$ would be more difficult
than that for $J_0(\alpha/2)$ only in that the determination of the numerical
argument of the interpolate in the former case would involve a multiplication
of two n-digit numbers. The necessary accuracy of the interpolation
would be no higher in one case than in the other.

Further, it may be noted that, when use is made of a large-scale digital
computer, and when $f(x)$ is defined analytically, values of the integrand
usually are not obtained by interpolation in tables in any case, but are
generated directly by "subroutines" incorporated in the program. Here,
since the machine does not distinguish between rational and irrational
arguments, the approximate evaluation of $f(\sqrt{15}/5)$ is in no way more
complicated than that of $f(1/2)$.

Thus, formulas such as (8.1.2) are indeed advantageous when the deter- 
mination of ordinates needed for the conventional formulas would involve 
either direct calculation, physical measurement, or interpolation and when 
the use of a “high-precision” formula is appropriate (see §3.7), and they 
are beginning to receive the attention they deserve.†

The developments of this chapter relate these formulas to a method 
of ‘‘osculating interpolation,” associated with the name of Hermite, 
which is treated in §8.2, and to an associated quadrature formula (§8.3). 
Several of the classical quadrature formulas of the Gaussian type, in 
which no abscissas are arbitrarily preassigned, are considered, together 
with their error terms, in the subsequent sections, which depend upon 
certain results from Chap. 7. The following section (§8.10) deals with
the modifications necessary when certain of the abscissas are preassigned, 
and the results are illustrated in the next two sections. Section 8.13 
deals with a special class of quadrature formulas in which the weights, 
rather than the abscissas, are preassigned, and §8.14 with algebraic 
methods for deriving quadrature formulas, of the type considered in this 
chapter, without making use of properties of orthogonal polynomials. 

8.2. Hermite Interpolation. The interpolation formulas so far con- 
sidered make use only of a certain number of values (approximate or 
exact) of the function to be approximated. Except in the case of the 
least-squares formulas of the preceding chapter, the interpolation polynomial
y(x) has been defined as that polynomial of lowest degree which
agrees with the approximated function f(x) at a certain discrete set of 
points. 

In certain cases, values of both $f(x)$ and its derivative $f'(x)$ are
available, say, at $m$ points.‡ We next derive an interpolation formula which
utilizes these $2m$ data and, in the remainder of this chapter, show that the
result leads to useful formulas for numerical integration which do not
depend upon knowledge of values of $f'(x)$.

† Perhaps because of the fact that they are particularly useful when the integrand is
defined analytically, they are usually called quadrature formulas, whereas formulas of
the usual type are usually called integration formulas. There is no basic distinction
between the terms.

‡ In the preceding chapters, integration formulas were obtained by integrating the
interpolation polynomial of degree n which agrees with the integrand at n + 1 points,
so that the principal emphasis was on the degree n of that interpolation polynomial,
and it was convenient to number the relevant n + 1 abscissas from 0 to n. On the
other hand, the derivations of the integration formulas which are to be treated in the
present chapter are based on certain properties of the polynomial whose zeros are
abscissas of the points involved in the integration formula. It is thus more convenient
to use a new symbol, say m, to represent the degree of that polynomial, and hence also
to represent the number of ordinates employed, and to number the ordinates from
1 to m.

Before proceeding to these matters, however, it is desirable to review the
Lagrangian interpolation formula treated in Chap. 3 and to write it in a
slightly modified form. If the values of $f(x)$ are known at the $m$ points
$x = x_1, x_2, \ldots, x_m$, the auxiliary functions

$$\pi(x) = (x - x_1)(x - x_2)\cdots(x - x_m) \tag{8.2.1}$$

and

$$l_i(x) = \frac{\pi(x)}{(x - x_i)\,\pi'(x_i)} = \frac{(x - x_1)\cdots(x - x_{i-1})(x - x_{i+1})\cdots(x - x_m)}{(x_i - x_1)\cdots(x_i - x_{i-1})(x_i - x_{i+1})\cdots(x_i - x_m)} \qquad (i = 1, 2, \ldots, m) \tag{8.2.2}$$

are first defined, with the properties

$$\pi(x_j) = 0 \tag{8.2.3}$$

and

$$l_i(x_j) = \delta_{ij}, \tag{8.2.4}$$

where $\delta_{ij}$ is the Kronecker delta (zero when $i \ne j$ and unity when $i = j$).
With these notations, the polynomial of degree $m - 1$ which takes on the
values $f(x_1), f(x_2), \ldots,$ and $f(x_m)$ is expressible in the form

$$y(x) = \sum_{k=1}^{m} l_k(x) f(x_k). \tag{8.2.5}$$

Also, if $f^{(m)}(x)$ is continuous in the interval $I$ limited by the largest and
smallest of the $m + 1$ numbers $x_1, x_2, \ldots, x_m$, and $x$, the error

$$E(x) = f(x) - y(x)$$

is expressible in the form

$$E(x) = \frac{f^{(m)}(\xi)}{m!}\,\pi(x), \tag{8.2.6}$$

where $\xi$ is somewhere in $I$.

Now suppose that values of both $f(x)$ and $f'(x)$ are known for $x_1, \ldots, x_m$.
Since a polynomial of degree $2m - 1$ is specified by $2m$ parameters,
it is plausible that one such polynomial $y(x)$ can be determined in such a
way that $y(x)$ and $f(x)$ possess the same value and the same derivative at
each of these $m$ points. We next attempt to determine such a polynomial
by assuming that it is expressible in the form

$$y(x) = \sum_{k=1}^{m} h_k(x) f(x_k) + \sum_{k=1}^{m} \bar{h}_k(x) f'(x_k), \tag{8.2.7}$$
R= k=1 
where $h_i(x)$ and $\bar{h}_i(x)$ $(i = 1, 2, \ldots, m)$ are polynomials of maximum
degree $2m - 1$, to be determined.
The requirement that $y(x_j) = f(x_j)$ clearly will be satisfied if

$$h_i(x_j) = \delta_{ij}, \qquad \bar{h}_i(x_j) = 0, \tag{8.2.8}$$

whereas the requirement $y'(x_j) = f'(x_j)$ will be satisfied if

$$h_i'(x_j) = 0, \qquad \bar{h}_i'(x_j) = \delta_{ij}, \tag{8.2.9}$$

for $1 \le i \le m$ and $1 \le j \le m$. Now, since $l_i(x)$ is a polynomial of
degree $m - 1$ which satisfies (8.2.4), the function $[l_i(x)]^2$ is a polynomial
of degree $2m - 2$ which satisfies (8.2.4) and whose derivative vanishes
at $x_j$ when $i \ne j$. Hence, since $h_i(x)$ and $\bar{h}_i(x)$ are polynomials of degree
$2m - 1$, there must follow

$$h_i(x) = r_i(x)[l_i(x)]^2, \qquad \bar{h}_i(x) = s_i(x)[l_i(x)]^2, \tag{8.2.10}$$

where $r_i(x)$ and $s_i(x)$ are linear functions of $x$, in order that the first
condition of (8.2.9) and the second condition of (8.2.8) be satisfied. The
other two conditions then give

$$r_i(x_i) = 1, \qquad r_i'(x_i) + 2l_i'(x_i) = 0 \tag{8.2.11}$$

and

$$s_i(x_i) = 0, \qquad s_i'(x_i) = 1, \tag{8.2.12}$$

from which there follows

$$r_i(x) = 1 - 2l_i'(x_i)(x - x_i), \qquad s_i(x) = x - x_i. \tag{8.2.13}$$

Hence, by combining (8.2.7), (8.2.10), and (8.2.13), we obtain the
desired polynomial in the form

$$y(x) = \sum_{k=1}^{m} h_k(x) f(x_k) + \sum_{k=1}^{m} \bar{h}_k(x) f'(x_k), \tag{8.2.14}$$

where

$$h_i(x) = [1 - 2l_i'(x_i)(x - x_i)][l_i(x)]^2 \tag{8.2.15}$$

and

$$\bar{h}_i(x) = (x - x_i)[l_i(x)]^2. \tag{8.2.16}$$

This result is known as Hermite’s interpolation formula or, frequently, as 
the formula for osculating interpolation (for a more general formula, 
which also uses values of higher derivatives of f(x), see Fort [82]). 
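The formula (8.2.14), with the coefficient functions (8.2.15) and (8.2.16), translates directly into a short routine. The following is a minimal sketch under stated assumptions: the function name `hermite_interpolate` is hypothetical, the abscissas are assumed distinct, and the derivative $l_i'(x_i)$ is evaluated as the sum of $1/(x_i - x_j)$ over $j \ne i$, which follows from differentiating the product form (8.2.2) at $x = x_i$.

```python
def hermite_interpolate(xs, fs, dfs, x):
    """Evaluate the osculating polynomial (8.2.14) at x.

    xs  : distinct abscissas x_1, ..., x_m
    fs  : values f(x_k)
    dfs : derivative values f'(x_k)
    """
    m = len(xs)
    total = 0.0
    for i in range(m):
        li = 1.0    # l_i(x), product form of (8.2.2)
        dli = 0.0   # l_i'(x_i) = sum over j != i of 1/(x_i - x_j)
        for j in range(m):
            if j != i:
                li *= (x - xs[j]) / (xs[i] - xs[j])
                dli += 1.0 / (xs[i] - xs[j])
        h = (1.0 - 2.0 * dli * (x - xs[i])) * li * li   # (8.2.15)
        h_bar = (x - xs[i]) * li * li                   # (8.2.16)
        total += h * fs[i] + h_bar * dfs[i]
    return total

# Illustrative check (assumed test data): with m = 2 the result has degree 3,
# so a cubic should be reproduced exactly apart from rounding.
f = lambda t: t**3 - 2.0*t + 1.0
df = lambda t: 3.0*t**2 - 2.0
xs = [0.0, 1.0]
print(hermite_interpolate(xs, [f(t) for t in xs], [df(t) for t in xs], 0.3), f(0.3))
```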

An expression for the error $E(x) = f(x) - y(x)$ can be obtained by a
method similar to that used in §2.6. Thus we notice that both $E(x)$
and $[\pi(x)]^2$ vanish together with their first derivatives at each of the $m$ points
$x = x_1, \ldots, x_m$. We then form a linear combination of these functions,

$$F(x) = f(x) - y(x) - K[\pi(x)]^2, \tag{8.2.17}$$

which therefore has the same properties, and determine $K$ in such a way
that $F(x)$ also vanishes at an arbitrarily chosen additional point $x = \bar{x}$.

Now let $I$ represent the closed interval limited by the smallest and
largest of the numbers $x_1, x_2, \ldots, x_m$, and $\bar{x}$. Since $F(x)$ vanishes at
these $m + 1$ distinct points, $F'(x)$ must vanish at at least $m$ intermediate
points inside $I$. But since $F'(x)$ also vanishes at the $m$ points $x_1, \ldots,
x_m$, it vanishes at least $2m$ times in $I$. Thus $F''(x)$ vanishes at least
$2m - 1$ times inside $I$, $F'''(x)$ at least $2m - 2$ times, $\ldots$, and hence,
finally, $F^{(2m)}(x)$ vanishes at least once inside $I$, assuming the continuity
of the derivatives considered. Let one such point be $\bar{\xi}$. Then, recalling
that $y(x)$ is a polynomial of degree $2m - 1$, and hence that $y^{(2m)}(x) = 0$,
we obtain from (8.2.17) the result

$$0 = F^{(2m)}(\bar{\xi}) = f^{(2m)}(\bar{\xi}) - K\cdot(2m)!$$

or

$$K = \frac{f^{(2m)}(\bar{\xi})}{(2m)!}.$$

Thus, since $F(\bar{x}) = 0$, there follows

$$E(\bar{x}) = f(\bar{x}) - y(\bar{x}) = \frac{f^{(2m)}(\bar{\xi})}{(2m)!}[\pi(\bar{x})]^2,$$

where $\bar{\xi}$ is somewhere in $I$. Since both sides of this relation vanish when
$\bar{x}$ is identified with one of the points $x_i$, the relation is true also for such
values of $\bar{x}$, and hence for any $\bar{x}$. Hence, by suppressing the bars, we
deduce that the error associated with approximating $f(x)$ by the right-
hand member of (8.2.14) is of the form

$$E(x) = \frac{f^{(2m)}(\xi)}{(2m)!}[\pi(x)]^2, \tag{8.2.18}$$

where $\xi$ is somewhere in the interval $I$. In particular, we may deduce
easily that the polynomial (8.2.14) is the only one having the desired
properties.

Thus the Hermite m-point formula yields exact results when $f(x)$ is
identified with any polynomial of degree not exceeding $2m - 1$.
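To make the error term (8.2.18) concrete, the following worked case takes $m = 2$ with the abscissas $x_1 = 0$ and $x_2 = 1$. The choice of points is an illustrative assumption and is not taken from the text; $x$ is restricted to the interval between them, so that $I$ is the interval $(0,1)$.

```latex
\pi(x) = x(x - 1), \qquad
E(x) = \frac{f^{iv}(\xi)}{4!}\,x^{2}(x - 1)^{2} \qquad (0 \le x \le 1).
% Since x^2 (1 - x)^2 attains its maximum value 1/16 at x = 1/2, there follows
|E(x)| \le \frac{1}{24}\cdot\frac{1}{16}\,\max_{0 \le t \le 1}|f^{iv}(t)|
       = \frac{1}{384}\,\max_{0 \le t \le 1}|f^{iv}(t)| .
```

Thus a bound on the fourth derivative over the interval suffices to bound the error of two-point osculating interpolation everywhere between the two points.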

8.3. Hermite Quadrature. From the Hermite interpolation formula
we may deduce the formula

$$\int_a^b w(x)f(x)\,dx = \sum_{k=1}^{m} H_k f(x_k) + \sum_{k=1}^{m} \bar{H}_k f'(x_k) + E, \tag{8.3.1}$$

with the weighting coefficients defined by the equations

$$H_i = \int_a^b w(x)h_i(x)\,dx = \int_a^b w(x)[1 - 2l_i'(x_i)(x - x_i)][l_i(x)]^2\,dx \tag{8.3.2}$$

and

$$\bar{H}_i = \int_a^b w(x)\bar{h}_i(x)\,dx = \int_a^b w(x)(x - x_i)[l_i(x)]^2\,dx, \tag{8.3.3}$$

and with the error expressible in the form

$$E = \int_a^b w(x)E(x)\,dx = \frac{1}{(2m)!}\int_a^b f^{(2m)}(\xi)\,w(x)[\pi(x)]^2\,dx, \tag{8.3.4}$$

where $a < \xi < b$ if the points $x_1, x_2, \ldots, x_m$ lie in that interval. The
result of neglecting the error term is called the Hermite quadrature
formula.

If the weighting function $w(x)$ is nonnegative in $(a,b)$,

$$w(x) \ge 0, \tag{8.3.5}$$

as will be assumed throughout this chapter, the coefficient of $f^{(2m)}(\xi)$ in the
integrand of (8.3.4) is nonnegative. Hence the second law of the mean
may be invoked to permit (8.3.4) to be written in the more convenient
form

$$E = \frac{f^{(2m)}(\xi)}{(2m)!}\int_a^b w(x)[\pi(x)]^2\,dx. \tag{8.3.6}$$

These results may be compared with the result which corresponds to
Lagrangian interpolation employing $m$ points, which can be expressed in
the form

$$\int_a^b w(x)f(x)\,dx = \sum_{k=1}^{m} W_k f(x_k) + E, \tag{8.3.7}$$

where

$$W_i = \int_a^b w(x)l_i(x)\,dx \tag{8.3.8}$$

and

$$E = \frac{1}{m!}\int_a^b f^{(m)}(\xi)\,w(x)\pi(x)\,dx. \tag{8.3.9}$$

Since $\pi(x)$ changes sign at each of the points $x_1, \ldots, x_m$, the law of the
mean cannot be applied directly to (8.3.9) to produce a form analogous to
(8.3.6).

If a quadrature formula yields exact results when $f(x)$ is an arbitrary
polynomial of degree $r$ or less, but fails to give exact results for at least
one polynomial of degree $r + 1$, it is said to possess a degree of precision
equal to $r$ (see §5.11). From the linearity of the process, it follows that
this situation exists if and only if exact results are afforded for $1, x, x^2,
\ldots, x^r$, but not for $x^{r+1}$.

From (8.3.6) we see that the degree of precision of the Hermite m-point
formula is exactly $2m - 1$. It follows also from (8.3.9) that the degree
of precision of the Lagrangian quadrature formula, based on $m$ points, is
at least $m - 1$. Furthermore, if we take $f(x) = [\pi(x)]^2$, we see that all
terms in the sum involved in (8.3.7) vanish, and hence, for this function,
the Lagrangian formula would give

$$\int_a^b w(x)[\pi(x)]^2\,dx = 0.$$

Under the assumption (8.3.5), this situation is impossible. Hence, since
$[\pi(x)]^2$ is a polynomial of degree $2m$, it follows that the degree of precision
of the Lagrangian m-point formula cannot exceed $2m - 1$. Unless further
information concerning the choice of the points $x_1, \ldots, x_m$ is available,
no more specific statement can be made. However, it is shown in
the following section that there exists a class of formulas of the simple
Lagrangian type (8.3.7) which actually have the maximum degree of
precision $2m - 1$.
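The monomial criterion for the degree of precision can be checked mechanically. The following is a minimal sketch, assuming the weighting function $w(x) = 1$ and rational abscissas and weights so that exact arithmetic can be used; the function name `degree_of_precision` and the two sample rules (Simpson's rule and the five-point closed Newton-Cotes rule, both referred to $(-1,1)$) are illustrative choices.

```python
from fractions import Fraction

def degree_of_precision(nodes, weights, a, b, max_degree=30):
    """Largest r such that the rule is exact for 1, x, ..., x^r, with w(x) = 1.

    Returns -1 if the rule is not exact even for f(x) = 1.
    """
    a, b = Fraction(a), Fraction(b)
    r = -1
    for k in range(max_degree + 1):
        exact = (b**(k + 1) - a**(k + 1)) / (k + 1)   # integral of x^k over (a, b)
        approx = sum(Fraction(w) * Fraction(x)**k for x, w in zip(nodes, weights))
        if approx != exact:
            break
        r = k
    return r

simpson = ([-1, 0, 1], [Fraction(1, 3), Fraction(4, 3), Fraction(1, 3)])
five_point = ([-1, Fraction(-1, 2), 0, Fraction(1, 2), 1],
              [Fraction(7, 45), Fraction(32, 45), Fraction(12, 45),
               Fraction(32, 45), Fraction(7, 45)])

print(degree_of_precision(*simpson, -1, 1))      # 3
print(degree_of_precision(*five_point, -1, 1))   # 5
```

Both values lie between the bounds $m - 1$ and $2m - 1$ stated above for a Lagrangian formula on $m$ points ($m = 3$ and $m = 5$ here), and neither attains the upper bound.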

8.4. Gaussian Quadrature. An inspection of (8.3.1) shows that, if the
points $x_1, \ldots, x_m$ can be chosen in such a way that the weighting
coefficients $\bar{H}_k$ associated with the derivative terms vanish, then the Hermite
formula will reduce to a formula of the simple type (8.3.7) while retaining
the degree of precision $2m - 1$. With the notation of (8.2.1) and (8.2.2),
the definition (8.3.3) can be expressed in the equivalent form

$$\bar{H}_i = \frac{1}{\pi'(x_i)}\int_a^b w(x)\pi(x)l_i(x)\,dx, \tag{8.4.1}$$

where, as before,

$$\pi(x) = (x - x_1)(x - x_2)\cdots(x - x_m), \tag{8.4.2}$$

so that $x_1, \ldots, x_m$ are the $m$ zeros of $\pi(x)$.

Thus $\bar{H}_i$ will vanish for $1 \le i \le m$, and the degree of precision $2m - 1$
will be preserved, if $\pi(x)$ is orthogonal to $l_1(x), \ldots, l_m(x)$ over $(a,b)$,
relative to the weighting function $w(x)$. Since each $l_i(x)$ is a polynomial
of degree $m - 1$, in virtue of (8.2.2), a sufficient condition is that $\pi(x)$ be
orthogonal to all polynomials of degree inferior to $m$ over $(a,b)$, relative
to $w(x)$.

This condition is also necessary. To see this, assume that

$$\bar{H}_i = 0 \qquad (1 \le i \le m), \tag{8.4.3}$$

and that the formula has a degree of precision $2m - 1$. Let $f(x)$ be a
polynomial, of degree $2m - 1$ or less, expressed in the special form

$$f(x) = \pi(x)u_{m-1}(x), \tag{8.4.4}$$

where $u_{m-1}(x)$ is an arbitrary polynomial of degree $m - 1$ or less. Then,
since $\pi(x_i) = 0$ for $1 \le i \le m$, there follows $f(x_i) = 0$, and hence, for this
polynomial, (8.3.1) becomes

$$\int_a^b w(x)f(x)\,dx = \int_a^b w(x)\pi(x)u_{m-1}(x)\,dx = 0, \tag{8.4.5}$$

as was to be shown, since $\bar{H}_i = 0$ by assumption and $E = 0$ by virtue
of the fact that here $f^{(2m)}(x) = 0$.

Hence we deduce that if and only if the polynomial $\pi(x)$, of degree $m$, is
orthogonal to all polynomials of inferior degree over $(a,b)$, relative to $w(x)$,
the Hermite quadrature formula reduces to the formula

$$\int_a^b w(x)f(x)\,dx = \sum_{k=1}^{m} H_k f(x_k) + E, \tag{8.4.6}$$

where

$$E = \frac{f^{(2m)}(\xi)}{(2m)!}\int_a^b w(x)[\pi(x)]^2\,dx, \tag{8.4.7}$$

and where the $m$ abscissas $x_1, \ldots, x_m$ are the zeros of $\pi(x)$.

A formula of this type is usually called a Gaussian quadrature formula, 
although it appears that only the case in which w(x) = 1 was explicitly 
considered by Gauss. 
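The characterization just stated suggests a direct, if naive, construction. The following is a minimal sketch for the case $w(x) = 1$ on $(-1,1)$, which is assumed here for illustration: the coefficients of the monic polynomial $\pi(x)$ are found from the orthogonality conditions against $1, x, \ldots, x^{m-1}$, its zeros are located by bisection, and the weights are obtained by requiring exactness for $1, x, \ldots, x^{m-1}$, which is equivalent to (8.3.8). All function names are hypothetical, and the zero search assumes that no zero falls exactly on a grid point.

```python
from fractions import Fraction

def moment(n):
    """Integral of x^n over (-1, 1) with w(x) = 1."""
    return Fraction(0) if n % 2 else Fraction(2, n + 1)

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (Fractions or floats)."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def monic_orthogonal(m):
    """c[0..m-1] such that pi(x) = x^m + sum c[k] x^k is orthogonal to x^j, j < m."""
    A = [[moment(j + k) for k in range(m)] for j in range(m)]
    b = [-moment(j + m) for j in range(m)]
    return solve(A, b)

def evaluate(c, x):
    return x**len(c) + sum(float(ck) * x**k for k, ck in enumerate(c))

def zeros_by_bisection(c, cells=2001):
    """Find sign changes of pi(x) on a grid over (-1, 1) and refine each one."""
    roots, step = [], 2.0 / cells
    for n in range(cells):
        lo, hi = -1.0 + n * step, -1.0 + (n + 1) * step
        if evaluate(c, lo) * evaluate(c, hi) < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if evaluate(c, lo) * evaluate(c, mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

def gauss_rule(m):
    c = monic_orthogonal(m)
    nodes = zeros_by_bisection(c)
    V = [[x**j for x in nodes] for j in range(m)]        # V[j][i] = x_i^j
    weights = solve(V, [float(moment(j)) for j in range(m)])
    return nodes, weights

print(monic_orthogonal(3))   # coefficients of x^3 - (3/5) x
print(gauss_rule(3))
```

For $m = 3$ this gives $\pi(x) = x^3 - \tfrac{3}{5}x$, with zeros at $0$ and $\pm\sqrt{15}/5$ and weights $8/9$ and $5/9$, in agreement with (8.1.2). The moment system becomes badly conditioned as $m$ grows, so the sketch is intended only for small $m$.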

Since (8.4.6) is a special case of both (8.3.1) and (8.3.7), the weighting
coefficients $H_i$ and $W_i$ given by (8.3.2) and (8.3.8) must be equal in this
case. Thus we may write

$$H_i = \int_a^b w(x)[l_i(x)]^2\,dx = \int_a^b w(x)l_i(x)\,dx = W_i, \tag{8.4.8}$$

the first form being obtained from (8.3.2) by writing that formula in the
form

$$H_i = \int_a^b w(x)[l_i(x)]^2\,dx - 2l_i'(x_i)\bar{H}_i \tag{8.4.9}$$

and recalling that here $\bar{H}_i = 0$.

The polynomial $\pi(x)$ is precisely that numerical multiple of the
polynomial $\phi_m(x)$, specified by Eqs. (7.4.4) to (7.4.7), for which the coefficient
of the leading power of $x$ is unity. Thus, as was shown in §7.4, its $m$
zeros are indeed real and distinct and are all located inside the interval
$(a,b)$. The interval need not be of finite length, so long as $w(x) \ge 0$ and
the integral $\int_a^b x^k w(x)\,dx$ exists for all nonnegative integral values of $k$.

It is of particular importance to notice that, in virtue of the first form
of (8.4.8), the weighting coefficients in a Gaussian quadrature formula are
all positive.

With the notation of §7.4, the error (8.4.7) can be expressed in the form

$$E = \frac{\gamma_m}{A_m^2\,(2m)!}\,f^{(2m)}(\xi), \tag{8.4.10}$$

where $\gamma_m$ is the normalizing factor corresponding to $\phi_m(x)$ and is defined
by (7.4.13), and where $A_m$ is the coefficient of $x^m$ in $\phi_m(x)$.†

In order to determine explicitly the weights $H_i$ defined by (8.4.8), we
first establish certain important additional properties of the relevant
orthogonal polynomials $\phi_r(x)$.

We first notice that

$$\phi_{k+1}(x) - \frac{A_{k+1}}{A_k}\,x\phi_k(x)$$

is a polynomial of maximum degree $k$. Hence, if we write

$$a_k = \frac{A_{k+1}}{A_k}, \tag{8.4.11}$$

it follows that $\phi_{k+1}(x) - a_k x\phi_k(x)$ can be expressed as a linear combination
of $\phi_0(x), \phi_1(x), \ldots, \phi_k(x)$, in the form

$$\phi_{k+1}(x) - a_k x\phi_k(x) = b_k\phi_k(x) + c_k\phi_{k-1}(x) + \cdots, \tag{8.4.12}$$

for some constant values of $b_k, c_k, \ldots$. But, since $x\phi_0(x), x\phi_1(x),
\ldots,$ and $x\phi_{k-2}(x)$ are polynomials of degree inferior to $k$, the two
terms in the left-hand member of (8.4.12) are both orthogonal to $\phi_0,
\phi_1, \ldots, \phi_{k-2}$ over $(a,b)$, relative to $w(x)$. Hence the same statement
applies to the right-hand member, so that the omitted terms in (8.4.12)
vanish, and we deduce that $\phi_k(x)$ satisfies a recurrence formula of the form

$$\phi_{k+1}(x) = (a_k x + b_k)\phi_k(x) + c_k\phi_{k-1}(x), \tag{8.4.13}$$

where $a_k$ is defined by (8.4.11), and $b_k$ and $c_k$ are certain other constants.
Equations (7.5.9), (7.6.12), and (7.7.10) serve as illustrations of this fact.
In order that (8.4.13) also hold when $k = 0$, the convention $\phi_{-1}(x) = 0$
may be adopted.

If we multiply the equal members of (8.4.13) successively by $w\phi_{k+1}$,
$w\phi_k$, and $w\phi_{k-1}$, and integrate each resultant equation over $(a,b)$, we
obtain the additional relations

$$\gamma_{k+1} = a_k\int_a^b x\,w(x)\phi_k(x)\phi_{k+1}(x)\,dx,$$

$$0 = a_k\int_a^b x\,w(x)[\phi_k(x)]^2\,dx + b_k\gamma_k,$$

$$0 = a_k\int_a^b x\,w(x)\phi_{k-1}(x)\phi_k(x)\,dx + c_k\gamma_{k-1}.$$

† Whereas $\phi_m(x)$ could always be so defined that either $\gamma_m$ or $A_m$ is unity, this choice
usually does not lead to a standard (tabulated) form. Hence, the formulas are given
without such a restriction.

If $k$ is replaced by $k - 1$ in the first equation, the result can be used to
eliminate the unknown integral from the third equation and to establish
the relation

$$c_k = -\frac{a_k\gamma_k}{a_{k-1}\gamma_{k-1}}. \tag{8.4.14}$$

Hence (8.4.13) can be rewritten in the form

$$\frac{x\phi_k(x)}{\gamma_k} = \frac{\phi_{k+1}(x)}{a_k\gamma_k} + \frac{\phi_{k-1}(x)}{a_{k-1}\gamma_{k-1}} - \frac{b_k\phi_k(x)}{a_k\gamma_k}. \tag{8.4.15}$$

Next, if both members of (8.4.15) are multiplied by $\phi_k(y)$, where $y$ is
an arbitrary parameter, and the result of interchanging $x$ and $y$ in the
result is subtracted from that result, the constant $b_k$ is eliminated, and the
more symmetrical relation

$$(x - y)\frac{\phi_k(x)\phi_k(y)}{\gamma_k} = \frac{\phi_{k+1}(x)\phi_k(y) - \phi_k(x)\phi_{k+1}(y)}{a_k\gamma_k} - \frac{\phi_k(x)\phi_{k-1}(y) - \phi_{k-1}(x)\phi_k(y)}{a_{k-1}\gamma_{k-1}} \tag{8.4.16}$$

is obtained. The result of summing the equal members from $k = 0$ to
$k = m$, and taking advantage of the "telescoping" of terms on the right,
is then the important relation

$$\sum_{k=0}^{m}\frac{\phi_k(x)\phi_k(y)}{\gamma_k} = \frac{\phi_{m+1}(x)\phi_m(y) - \phi_m(x)\phi_{m+1}(y)}{a_m\gamma_m(x - y)}, \tag{8.4.17}$$
known as the Christoffel-Darboux identity. 
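The identity (8.4.17) is easily checked numerically in a particular case. The following is a minimal sketch for the Legendre polynomials on $(-1,1)$ with $w(x) = 1$; the values $a_m = (2m + 1)/(m + 1)$ and $\gamma_m = 2/(2m + 1)$, and the three-term recurrence used to generate $P_k(x)$, are the standard Legendre values and are assumed here rather than derived. The function names are hypothetical.

```python
def legendre_values(n, x):
    """Return [P_0(x), ..., P_n(x)] from (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    vals = [1.0, x]
    for k in range(1, n):
        vals.append(((2*k + 1) * x * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[:n + 1]

def cd_left(m, x, y):
    """Left-hand member of (8.4.17), with 1/gamma_k = (2k+1)/2."""
    px, py = legendre_values(m, x), legendre_values(m, y)
    return sum(px[k] * py[k] * (2*k + 1) / 2.0 for k in range(m + 1))

def cd_right(m, x, y):
    """Right-hand member of (8.4.17), with a_m * gamma_m = 2/(m+1)."""
    px, py = legendre_values(m + 1, x), legendre_values(m + 1, y)
    a_gamma = 2.0 / (m + 1)
    return (px[m + 1] * py[m] - px[m] * py[m + 1]) / (a_gamma * (x - y))

# Arbitrary sample arguments with x != y.
for m in range(6):
    print(m, cd_left(m, 0.3, -0.7), cd_right(m, 0.3, -0.7))
```

The two columns agree to rounding error for each $m$, which is what the telescoping argument predicts.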
Now, if we notice that

$$\phi_m(x) = A_m\pi(x), \tag{8.4.18}$$

and identify $y$ with $x_i$, where $x_i$ is a zero of $\pi(x)$, so that $\phi_m(x_i) = 0$,
Eq. (8.4.17) specializes to the form

$$\sum_{k=0}^{m}\frac{\phi_k(x)\phi_k(x_i)}{\gamma_k} = -\frac{\phi_m(x)\phi_{m+1}(x_i)}{a_m\gamma_m(x - x_i)}. \tag{8.4.19}$$

The result of multiplying the equal members of (8.4.19) by $w(x)\phi_0(x)$,
integrating the results over $(a,b)$, and making use of the orthogonality of
the polynomials, relative to $w(x)$, is then

$$\phi_0(x_i) = -\frac{\phi_{m+1}(x_i)}{a_m\gamma_m}\int_a^b \frac{w(x)\phi_0(x)\phi_m(x)}{x - x_i}\,dx$$

or, since $\phi_0(x)$ is a constant,

$$\int_a^b w(x)\frac{\phi_m(x)}{x - x_i}\,dx = -\frac{a_m\gamma_m}{\phi_{m+1}(x_i)}. \tag{8.4.20}$$

Finally, since

$$l_i(x) = \frac{\pi(x)}{(x - x_i)\pi'(x_i)} = \frac{\phi_m(x)}{(x - x_i)\phi_m'(x_i)}, \tag{8.4.21}$$

reference to the second form of (8.4.8) leads to the desired result

$$H_i = \frac{1}{\phi_m'(x_i)}\int_a^b w(x)\frac{\phi_m(x)}{x - x_i}\,dx = -\frac{A_{m+1}\gamma_m}{A_m\phi_m'(x_i)\phi_{m+1}(x_i)}. \tag{8.4.22}$$

Alternatively, since (8.4.13) and (8.4.14) give

$$\phi_{m+1}(x_i) = -\frac{A_{m+1}A_{m-1}}{A_m^2}\frac{\gamma_m}{\gamma_{m-1}}\phi_{m-1}(x_i),$$

this result may be expressed in the equivalent form

$$H_i = \frac{A_m\gamma_{m-1}}{A_{m-1}\phi_m'(x_i)\phi_{m-1}(x_i)}. \tag{8.4.23}$$
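The two expressions for the weights can be compared numerically. The following is a minimal sketch for the Legendre case ($w(x) = 1$ over $(-1,1)$), assuming the values $A_m = (2m)!/[2^m (m!)^2]$ and $\gamma_m = 2/(2m + 1)$ quoted in §8.5. The zeros are found by Newton's method from a commonly used cosine starting estimate, and $P_m'(x)$ is obtained from the relation $(1 - x^2)P_m'(x) = m[P_{m-1}(x) - xP_m(x)]$. The function names are hypothetical.

```python
import math

def legendre_list(n, x):
    """Return [P_0(x), ..., P_n(x)] by the three-term recurrence."""
    vals = [1.0, x]
    for k in range(1, n):
        vals.append(((2*k + 1) * x * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[:n + 1]

def A(m):
    """Leading coefficient of P_m(x)."""
    return math.factorial(2*m) / (2**m * math.factorial(m)**2)

def gamma(m):
    """Normalizing factor of P_m(x) over (-1, 1)."""
    return 2.0 / (2*m + 1)

def legendre_zero(m, i):
    """i-th zero of P_m (i = 1, ..., m), refined by Newton's method."""
    x = math.cos(math.pi * (i - 0.25) / (m + 0.5))   # starting estimate
    for _ in range(50):
        P = legendre_list(m, x)
        dP = m * (P[m - 1] - x * P[m]) / (1.0 - x * x)
        x -= P[m] / dP
    return x

def legendre_gauss_weights(m):
    """Return (x_i, H_i from (8.4.22), H_i from (8.4.23)) for each zero."""
    rows = []
    for i in range(1, m + 1):
        x = legendre_zero(m, i)
        P = legendre_list(m + 1, x)
        dP = m * (P[m - 1] - x * P[m]) / (1.0 - x * x)
        h_22 = -A(m + 1) * gamma(m) / (A(m) * dP * P[m + 1])
        h_23 = A(m) * gamma(m - 1) / (A(m - 1) * dP * P[m - 1])
        rows.append((x, h_22, h_23))
    return rows

for row in sorted(legendre_gauss_weights(4)):
    print(row)
```

For $m = 4$ the two forms agree to rounding error, and the abscissas and weights reproduce the corresponding six-digit entries of Table 8.1.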


Many other useful results can be deduced from the identity (8.4.17).
In particular, by considering the limiting form of that relation as $y \to x$,
we obtain the equation

$$\sum_{k=0}^{m}\frac{[\phi_k(x)]^2}{\gamma_k} = \frac{A_m}{A_{m+1}\gamma_m}\left[\phi_{m+1}'(x)\phi_m(x) - \phi_m'(x)\phi_{m+1}(x)\right], \tag{8.4.24}$$

and the result of setting $x = x_i$, where $x_i$ is a zero of $\phi_m(x)$, is the curious
relationship

$$\sum_{k=0}^{m}\frac{[\phi_k(x_i)]^2}{\gamma_k} = -\frac{A_m}{A_{m+1}\gamma_m}\phi_m'(x_i)\phi_{m+1}(x_i) = \frac{1}{H_i}. \tag{8.4.25}$$
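As a check on the relation obtained by setting $x = x_i$ in (8.4.24), the Legendre case $m = 2$ can be worked by hand. The values $A_2 = 3/2$, $A_3 = 5/2$, and $\gamma_k = 2/(2k + 1)$ are the standard Legendre values and are assumed here; the zero used is $x_i = 1/\sqrt{3}$, for which $P_1(x_i) = 1/\sqrt{3}$, $P_2(x_i) = 0$, $P_2'(x_i) = \sqrt{3}$, and $P_3(x_i) = -2/(3\sqrt{3})$.

```latex
\sum_{k=0}^{2}\frac{[P_k(x_i)]^2}{\gamma_k}
  = \frac{1}{2}\cdot 1 + \frac{3}{2}\cdot\frac{1}{3} + \frac{5}{2}\cdot 0 = 1,
\qquad
-\frac{A_2}{A_3\gamma_2}\,P_2'(x_i)\,P_3(x_i)
  = -\frac{3}{2}\cdot\sqrt{3}\cdot\left(-\frac{2}{3\sqrt{3}}\right) = 1 .
```

Both members equal unity, which is also the reciprocal of the weight $H_i = 1$ given by (8.4.22) for the two-point formula.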

8.5. Legendre-Gauss Quadrature. In the case when a constant weighting function is to be used over a finite interval, it is convenient to suppose that a suitable change of variables has transformed that interval into the interval $(-1,1)$. From the results of §7.5, we then have

$$\pi(x) = \frac{1}{A_m}P_m(x), \tag{8.5.1}$$

where $P_m(x)$ is the $m$th Legendre polynomial, and where

$$A_m = \frac{(2m)!}{2^m (m!)^2}. \tag{8.5.2}$$

With the additional result

$$\gamma_m = \frac{2}{2m+1}, \tag{8.5.3}$$

Eqs. (8.4.6), (8.4.22) or (8.4.23), and (8.4.10) reduce to the quadrature formula

$$\int_{-1}^{1} f(x)\,dx = \sum_{k=1}^{m} H_k f(x_k) + E, \tag{8.5.4}$$

where $x_i$ is the $i$th zero of $P_m(x)$, and where

$$H_i = -\frac{2}{(m+1)P_{m+1}(x_i)P_m'(x_i)} = \frac{2}{mP_{m-1}(x_i)P_m'(x_i)} \tag{8.5.5}$$

and

$$E = \frac{2^{2m+1}(m!)^4}{(2m+1)[(2m)!]^3}\,f^{(2m)}(\xi). \tag{8.5.6}$$

From the known relation†

$$(1 - x^2)P_m'(x) = -mxP_m(x) + mP_{m-1}(x) = (m+1)xP_m(x) - (m+1)P_{m+1}(x), \tag{8.5.7}$$

there follows also

$$(1 - x_i^2)P_m'(x_i) = mP_{m-1}(x_i) = -(m+1)P_{m+1}(x_i),$$

so that (8.5.5) can also be expressed in the forms

$$H_i = \frac{2}{(1 - x_i^2)[P_m'(x_i)]^2} = \frac{2(1 - x_i^2)}{m^2[P_{m-1}(x_i)]^2} = \frac{2(1 - x_i^2)}{(m+1)^2[P_{m+1}(x_i)]^2}. \tag{8.5.8}$$

In illustration, when $m = 3$, there follows

$$\pi(x) = \tfrac{2}{5}P_3(x) = x\left(x^2 - \tfrac{3}{5}\right),$$

and hence

$$x_1 = -\frac{\sqrt{15}}{5}, \qquad x_2 = 0, \qquad x_3 = \frac{\sqrt{15}}{5}$$

and

$$H_1 = \tfrac{5}{9}, \qquad H_2 = \tfrac{8}{9}, \qquad H_3 = \tfrac{5}{9}.$$

Thus

$$\int_{-1}^{1} f(x)\,dx = \frac{1}{9}\left[5f\!\left(-\frac{\sqrt{15}}{5}\right) + 8f(0) + 5f\!\left(\frac{\sqrt{15}}{5}\right)\right] + \frac{f^{\mathrm{vi}}(\xi)}{15750}, \tag{8.5.9}$$

where $|\xi| < 1$.

† For derivations of (8.5.7), and of other similar differential recurrence formulas listed in this chapter, see Szegö [220].

The abscissas and weights corresponding to formulas for which $2 \le m \le 5$ are listed, to six digits, in Table 8.1. More elaborate tabulations are listed in the references.

TABLE 8.1

 m    Abscissas      Weights
 2    ±0.577350      1
 3     0             8/9
      ±0.774597      5/9
 4    ±0.339981      0.652145
      ±0.861136      0.347855
 5     0             0.568889
      ±0.538469      0.478629
      ±0.906180      0.236927

8.6. Laguerre-Gauss Quadrature. In the case when the weighting function

$$w(x) = e^{-x} \tag{8.6.1}$$

is used over the semi-infinite interval $(0,\infty)$, the results of §7.6 give

$$\pi(x) = \frac{1}{A_m}L_m(x), \tag{8.6.2}$$

where $L_m(x)$ is the $m$th Laguerre polynomial, and where

$$A_m = (-1)^m. \tag{8.6.3}$$

In addition, there follows

$$\gamma_m = (m!)^2, \tag{8.6.4}$$

and hence the formulas of §8.4 become

$$\int_0^{\infty} e^{-x}f(x)\,dx = \sum_{k=1}^{m} H_k f(x_k) + E, \tag{8.6.5}$$

where $x_i$ is the $i$th zero of $L_m(x)$,‡ and where

$$H_i = \frac{(m!)^2}{L_m'(x_i)L_{m+1}(x_i)} = -\frac{[(m-1)!]^2}{L_m'(x_i)L_{m-1}(x_i)} \tag{8.6.6}$$

and

$$E = \frac{(m!)^2}{(2m)!}\,f^{(2m)}(\xi), \tag{8.6.7}$$

where $0 < \xi < \infty$.

‡ The more general case in which $e^{-x}$ is replaced by $e^{-ax}$ in (8.6.5) is clearly reduced to the present case by a simple change of variables.

From the relation

$$xL_m'(x) = mL_m(x) - m^2L_{m-1}(x) = (x - m - 1)L_m(x) + L_{m+1}(x), \tag{8.6.8}$$

there follows also

$$x_iL_m'(x_i) = -m^2L_{m-1}(x_i) = L_{m+1}(x_i),$$

so that (8.6.6) can also be expressed in the forms

$$H_i = \frac{(m!)^2}{x_i[L_m'(x_i)]^2} = \frac{(m!)^2\,x_i}{[L_{m+1}(x_i)]^2}. \tag{8.6.9}$$

In illustration, when $m = 2$, there follows

$$\pi(x) = L_2(x) = x^2 - 4x + 2,$$

and hence

$$x_1 = 2 - \sqrt{2}, \qquad x_2 = 2 + \sqrt{2}$$

and

$$H_1 = \frac{2 + \sqrt{2}}{4}, \qquad H_2 = \frac{2 - \sqrt{2}}{4}.$$

Thus

$$\int_0^{\infty} e^{-x}f(x)\,dx = \frac{1}{4}\left[(2 + \sqrt{2})f(2 - \sqrt{2}) + (2 - \sqrt{2})f(2 + \sqrt{2})\right] + \frac{1}{6}f^{\mathrm{iv}}(\xi), \tag{8.6.10}$$

where $0 < \xi < \infty$ or, more generally,

$$\int_0^{\infty} e^{-ax}f(x)\,dx = \frac{1}{4a}\left[(2 + \sqrt{2})f\!\left(\frac{2 - \sqrt{2}}{a}\right) + (2 - \sqrt{2})f\!\left(\frac{2 + \sqrt{2}}{a}\right)\right] + \frac{1}{6a^5}f^{\mathrm{iv}}(\xi). \tag{8.6.11}$$
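The three-point Legendre-Gauss rule and the two-point Laguerre-Gauss rule, together with their error terms, can be checked on monomials. The following is a small sketch using only the Python standard library; the function names `legendre3` and `laguerre2` are hypothetical, and the test integrands are chosen here for illustration.

```python
import math

def legendre3(f):
    """Three-point Legendre-Gauss rule on (-1, 1), error term omitted."""
    r = math.sqrt(15.0) / 5.0
    return (5.0 * f(-r) + 8.0 * f(0.0) + 5.0 * f(r)) / 9.0

def laguerre2(f):
    """Two-point Laguerre-Gauss rule for weight exp(-x) on (0, inf), error term omitted."""
    s = math.sqrt(2.0)
    return ((2.0 + s) * f(2.0 - s) + (2.0 - s) * f(2.0 + s)) / 4.0

# Degree 2m - 1 = 5 is integrated exactly by the three-point rule: integral of x^4 is 2/5.
leg_x4 = legendre3(lambda x: x ** 4)
# For x^6 the sixth derivative is the constant 720, so the error is 720/15750.
leg_x6_error = 2.0 / 7.0 - legendre3(lambda x: x ** 6)

# Degree 3 is integrated exactly by the two-point rule: integral of exp(-x) x^3 is 3! = 6.
lag_x3 = laguerre2(lambda x: x ** 3)
# For x^4 the fourth derivative is the constant 24, so the error is 24/6 = 4.
lag_x4_error = 24.0 - laguerre2(lambda x: x ** 4)
```

Because the relevant derivative is constant for these monomials, the error terms are reproduced exactly rather than merely bounded.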





The abscissas and weights corresponding to formulas for which $2 \le m \le 5$ are listed, to six digits, in Table 8.2. Other tabulations are listed in the references.
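Entries of Table 8.2 can be regenerated numerically. The sketch below assumes NumPy; `laggauss` supplies reference abscissas and weights, and the hypothetical helper `laguerre_weights` recomputes the weights from the zeros alone. Note the assumption about normalization: NumPy scales its Laguerre polynomials so that $L_m(0) = 1$, whereas the polynomials of this section have leading coefficient $(-1)^m$ and are $m!$ times larger, which turns $(m!)^2 x_i/[L_{m+1}(x_i)]^2$ into $x_i/\{(m+1)^2[L_{m+1}(x_i)]^2\}$.

```python
import numpy as np
from numpy.polynomial.laguerre import Laguerre, laggauss

def laguerre_weights(m):
    """Return abscissas, reference weights, and weights recomputed from the zeros."""
    x, w_ref = laggauss(m)
    # NumPy's L_{m+1} equals the text's L_{m+1} divided by (m+1)!.
    L_next = Laguerre.basis(m + 1)(x)
    w_formula = x / ((m + 1) ** 2 * L_next ** 2)
    return x, w_ref, w_formula

x2, w2_ref, w2_formula = laguerre_weights(2)
x5, w5_ref, w5_formula = laguerre_weights(5)
```

Rounded to six digits, `x2` and `w2_ref` should agree with the $m = 2$ rows of the table.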














TABLE 8.2

 m    Abscissas      Weights
 2     0.585786      0.853553
       3.414214      0.146447
 3     0.415775      0.711093
       2.294280      0.278518
       6.289945      0.0103893
 4     0.322548      0.603154
       1.745761      0.357419
       4.536620      0.0388879
       9.395071      0.000539295
 5     0.263560      0.521756
       1.413403      0.398667
       3.596426      0.0759424
       7.085810      0.00361176
      12.640801      0.0000233700

When the weighting function (8.6.1) is generalized to the function

$$w(x) = x^{\beta}e^{-x} \qquad (\beta > -1), \tag{8.6.12}$$

it is easily found, by the methods of §7.6, that

$$\pi(x) = \frac{1}{A_m}L_m^{\beta}(x), \tag{8.6.13}$$

where $L_m^{\beta}(x)$ is the generalized Laguerre polynomial of degree $m$,

$$L_m^{\beta}(x) = x^{-\beta}e^{x}\frac{d^m}{dx^m}\left(e^{-x}x^{\beta+m}\right), \tag{8.6.14}$$

and that

$$A_m = (-1)^m, \qquad \gamma_m = m!\int_0^{\infty} x^{\beta+m}e^{-x}\,dx = m!\,\Gamma(m+\beta+1). \tag{8.6.15}$$

It can also be shown that the differential recurrence formula

$$x\,\frac{d}{dx}L_m^{\beta}(x) = mL_m^{\beta}(x) - m(m+\beta)L_{m-1}^{\beta}(x) = (x - m - \beta - 1)L_m^{\beta}(x) + L_{m+1}^{\beta}(x) \tag{8.6.16}$$

is satisfied.

From these results, the corresponding quadrature formula is readily derived in the form

$$\int_0^{\infty} x^{\beta}e^{-x}f(x)\,dx = \sum_{k=1}^{m} H_k f(x_k) + E, \tag{8.6.17}$$

where $x_i$ is the $i$th zero of $L_m^{\beta}(x)$, and where

$$H_i = \frac{m!\,\Gamma(m+\beta+1)}{x_i\left[{L_m^{\beta}}'(x_i)\right]^2} = \frac{m!\,\Gamma(m+\beta+1)\,x_i}{\left[L_{m+1}^{\beta}(x_i)\right]^2} \tag{8.6.18}$$

and

$$E = \frac{m!\,\Gamma(m+\beta+1)}{(2m)!}\,f^{(2m)}(\xi). \tag{8.6.19}$$

A brief table of abscissas and weights ($m, \beta = 2, 3, 4$) has been given by Burnett [52].


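8.7. Hermite-Gauss Quadrature. In the case when the weighting function

$$w(x) = e^{-x^2} \tag{8.7.1}$$

is used over the interval $(-\infty,\infty)$, the results of §7.7 give

$$\pi(x) = \frac{1}{A_m}H_m(x), \tag{8.7.2}$$

where $H_m(x)$ is the $m$th Hermite polynomial, and where

$$A_m = 2^m. \tag{8.7.3}$$

In addition, there follows

$$\gamma_m = \sqrt{\pi}\,2^m m!, \tag{8.7.4}$$

so that the appropriate Gaussian formula is of the form

$$\int_{-\infty}^{\infty} e^{-x^2}f(x)\,dx = \sum_{k=1}^{m} H_k f(x_k) + E, \tag{8.7.5}$$

where $x_i$ is the $i$th zero of $H_m(x)$, and where

$$H_i = -\frac{2^{m+1}m!\sqrt{\pi}}{H_m'(x_i)H_{m+1}(x_i)} = \frac{2^m(m-1)!\sqrt{\pi}}{H_m'(x_i)H_{m-1}(x_i)} \tag{8.7.6}$$

and

$$E = \frac{m!\sqrt{\pi}}{2^m(2m)!}\,f^{(2m)}(\xi), \tag{8.7.7}$$

for some $\xi$. From the relation

$$H_m'(x) = 2mH_{m-1}(x) = 2xH_m(x) - H_{m+1}(x), \tag{8.7.8}$$

there follows also

$$H_m'(x_i) = 2mH_{m-1}(x_i) = -H_{m+1}(x_i),$$

so that (8.7.6) can also be expressed in the forms

$$H_i = \frac{2^{m+1}m!\sqrt{\pi}}{[H_m'(x_i)]^2} = \frac{2^{m+1}m!\sqrt{\pi}}{[H_{m+1}(x_i)]^2}. \tag{8.7.9}$$

In illustration, when $m = 3$, there follows

$$\pi(x) = \tfrac{1}{8}H_3(x) = x\left(x^2 - \tfrac{3}{2}\right),$$

and hence

$$x_1 = -\frac{\sqrt{6}}{2}, \qquad x_2 = 0, \qquad x_3 = \frac{\sqrt{6}}{2}$$

and

$$H_1 = \frac{\sqrt{\pi}}{6}, \qquad H_2 = \frac{2\sqrt{\pi}}{3}, \qquad H_3 = \frac{\sqrt{\pi}}{6}.$$

Thus

$$\int_{-\infty}^{\infty} e^{-x^2}f(x)\,dx = \frac{\sqrt{\pi}}{6}\left[f\!\left(-\frac{\sqrt{6}}{2}\right) + 4f(0) + f\!\left(\frac{\sqrt{6}}{2}\right)\right] + \frac{\sqrt{\pi}}{960}f^{\mathrm{vi}}(\xi) \tag{8.7.10}$$

or, more generally,

$$\int_{-\infty}^{\infty} e^{-a^2x^2}f(x)\,dx = \frac{\sqrt{\pi}}{6a}\left[f\!\left(-\frac{\sqrt{6}}{2a}\right) + 4f(0) + f\!\left(\frac{\sqrt{6}}{2a}\right)\right] + \frac{\sqrt{\pi}}{960a^7}f^{\mathrm{vi}}(\xi). \tag{8.7.11}$$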


The abscissas and weights corresponding to formulas for which $2 \le m \le 5$ are listed, to six digits, in Table 8.3. More extensive tabulations are listed in the references.

TABLE 8.3

 m    Abscissas      Weights
 2    ±0.707107      0.886227
 3     0             1.181636
      ±1.224745      0.295409
 4    ±0.524648      0.804914
      ±1.650680      0.0813128
 5     0             0.945309
      ±0.958572      0.393619
      ±2.020183      0.0199532
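The Hermite-Gauss weights in Table 8.3 can be recomputed from the zeros by the second form of (8.7.9). The following is a sketch assuming NumPy, whose `numpy.polynomial.hermite` module uses the same Hermite polynomials as this section (leading coefficient $2^m$, weight $e^{-x^2}$); the helper name `hermite_gauss` is hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite import Hermite, hermgauss

def hermite_gauss(m):
    """Return abscissas, reference weights, and weights from 2^(m+1) m! sqrt(pi) / H_{m+1}(x_i)^2."""
    x, w_ref = hermgauss(m)
    H_next = Hermite.basis(m + 1)(x)
    w_formula = 2.0 ** (m + 1) * math.factorial(m) * math.sqrt(math.pi) / H_next ** 2
    return x, w_ref, w_formula

x3, w3_ref, w3_formula = hermite_gauss(3)
x5, w5_ref, w5_formula = hermite_gauss(5)
```

For $m = 3$ the results should reproduce $x_3 = \sqrt{6}/2$ and the weights $\sqrt{\pi}/6$, $2\sqrt{\pi}/3$, $\sqrt{\pi}/6$ found in the illustration.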


It should be noticed that no restrictions are imposed on $\xi$ in the error formula (8.7.7), other than that it be real. (Similarly, in the error formula of the preceding section, it is known only that $\xi$ is real and positive.) Thus, in those cases when $f^{(2m)}(x)$ varies greatly in magnitude when $m$ is large, as $x$ takes on all real values, the imprecision associated with the use of (8.7.7) is correspondingly great.

For example, if $f(x) = 1/(1 + x^2)$, it is easily seen from the Maclaurin expansion of $f(x)$ that $|f^{(2m)}(x)| = (2m)!$ when $x = 0$, so that, in the evaluation of the integral

$$\int_{-\infty}^{\infty}\frac{e^{-x^2}}{1 + x^2}\,dx = 1.34329\cdots$$

by $m$-point Hermite quadrature, the error formula (8.7.7) would admit the possibility of an error as large as $\sqrt{\pi}\,m!/2^m$, if the appropriate (but unknown) value of $\xi$ were near zero. If this were indeed the case, the error would increase rapidly with $m$, when $m > 2$. However, it has been pointed out by Rosser [197] that the errors corresponding to the use of 2, 10, and 16 points are about 0.16, 0.0016, and 0.00016, respectively. Thus the error truly decreases with increasing $m$, so that $\xi$ must increase rapidly in magnitude as $m$ increases, at least for $m \le 16$, but the rate of convergence is extremely slow. Similar slow convergence (or divergence) may be expected, more generally, whenever $f(x)$ possesses singularities, in the finite part of the complex plane, which are fairly near the real axis (compare §3.7).

In this connection, it may be noticed that, in view of the appropriate generalization of (5.11.34), the truncation error in the general Gaussian quadrature formula (8.4.6) can be expressed in the form

$$E = \int_a^b w(x)[\pi(x)]^2 f[x_1,x_1,x_2,x_2,\ldots,x_m,x_m,x]\,dx = f[x_1,x_1,x_2,x_2,\ldots,x_m,x_m,\xi_1]\int_a^b w(x)[\pi(x)]^2\,dx = \frac{\gamma_m}{A_m^2}\,f[x_1,x_1,x_2,x_2,\ldots,x_m,x_m,\xi_1], \tag{8.7.12}$$

with the notation used in (8.4.10), where $a < \xi_1 < b$. This form reduces to (8.4.10) when the divided difference is replaced by $f^{(2m)}(\xi_2)/(2m)!$, where $a < \xi_2 < b$. Since $\xi_1$ and $\xi_2$ generally cannot be estimated, one generally must replace either the divided difference or the derivative by its maximum absolute value for all $\xi$ in $(a,b)$, to obtain an upper bound on $|E|$, and it may happen that the bound obtained from (8.7.12) is much less conservative than that obtained from (8.4.10).
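The slow convergence of Hermite-Gauss quadrature for $f(x) = 1/(1 + x^2)$ discussed above can be observed directly. The sketch below assumes NumPy and relies on the closed form $\pi e\,\mathrm{erfc}(1)$ for the integral, a standard result that is not derived in this section; the helper name `hermite_error` is hypothetical.

```python
import math
from numpy.polynomial.hermite import hermgauss

# Closed form of the integral of exp(-x^2)/(1+x^2) over the real line (assumed known).
exact = math.pi * math.e * math.erfc(1.0)

def hermite_error(m):
    """True error (exact minus approximation) of m-point Hermite-Gauss quadrature."""
    x, w = hermgauss(m)
    return exact - float((w / (1.0 + x ** 2)).sum())

errors = {m: hermite_error(m) for m in (2, 10, 16)}
```

The computed errors can be compared with the values of about 0.16, 0.0016, and 0.00016 quoted from Rosser; they shrink with $m$, but only by roughly one decimal digit for each several added points.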

In illustration, if $f(x) = 1/(x + a)$, there follows

$$f[x_1,x_1,\ldots,x_m,x_m,\xi_1] = \frac{1}{[(a + x_1)\cdots(a + x_m)]^2\,(a + \xi_1)}$$

and

$$\frac{f^{(2m)}(\xi_2)}{(2m)!} = \frac{1}{(a + \xi_2)^{2m+1}}.$$

Thus, for example, if five-point Laguerre-Gauss quadrature were to be used to approximate the integral

$$\int_0^{\infty}\frac{e^{-x}}{x + 1}\,dx,$$

the truncation-error terms corresponding to the use of (8.7.12) and (8.6.7) would be of the forms

$$\frac{(120)^2}{(2.39\times 10^{6})(1 + \xi_1)} \doteq \frac{0.0060}{1 + \xi_1}, \qquad \frac{1.44\times 10^{4}}{(1 + \xi_2)^{11}},$$

respectively, where the abscissas are taken from Table 8.2 and where $\xi_1$ and $\xi_2$ are known only to be positive. Accordingly, the use of (8.7.12) here permits the determination of an error bound which is smaller than that obtainable from (8.6.7) in a ratio of about $2.4\times 10^{6}$. The actual truncation error rounds to 0.0013.
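The figures in this comparison can be reproduced numerically. The sketch below assumes NumPy; the value of the integral, $e\,E_1(1) \approx 0.5963473623$, is hard-coded as a known constant (it is not derived in the text), and the variable names are illustrative only.

```python
import numpy as np
from numpy.polynomial.laguerre import laggauss

x, w = laggauss(5)                               # five-point Laguerre-Gauss rule
approx = float(np.sum(w / (1.0 + x)))            # quadrature value for f(x) = 1/(x + 1)
exact = 0.5963473623231941                       # assumed known value of the integral
error = exact - approx                           # actual truncation error

# gamma_m / A_m^2 = (5!)^2 = 14400 for the five-point Laguerre case.
prod_sq = float(np.prod(1.0 + x) ** 2)           # [(1 + x_1) ... (1 + x_5)]^2
bound_divided_difference = 120.0 ** 2 / prod_sq  # divided-difference form at xi_1 = 0
bound_derivative = 120.0 ** 2                    # derivative form at xi_2 = 0
ratio = bound_derivative / bound_divided_difference
```

Here `prod_sq` should come out near $2.39\times 10^{6}$, `bound_divided_difference` near 0.0060, and `error` near 0.0013.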

Whereas this case is a rather extreme one, still, when $f(x)$ is such that an upper bound on the magnitude of the relevant divided difference can be obtained practically, the use of (8.7.12) is usually preferable to that of (8.4.10).

8.8. Chebyshev-Gauss Quadrature. For the weighting function

$$w(x) = \frac{1}{\sqrt{1 - x^2}} \tag{8.8.1}$$

over the interval $(-1,1)$, the results of §7.8 give

$$\pi(x) = \frac{1}{A_m}T_m(x), \tag{8.8.2}$$

where $T_m(x)$ is the $m$th Chebyshev polynomial. With the additional results

$$A_m = 2^{m-1}, \qquad \gamma_m = \frac{\pi}{2}, \tag{8.8.3}$$

the relevant Gaussian formula is obtained in the form

$$\int_{-1}^{1}\frac{f(x)}{\sqrt{1 - x^2}}\,dx = \sum_{k=1}^{m} H_k f(x_k) + E, \tag{8.8.4}$$

where $x_i$ is the $i$th zero of $T_m(x)$, and where

$$H_i = -\frac{\pi}{T_m'(x_i)\,T_{m+1}(x_i)} \tag{8.8.5}$$

and

$$E = \frac{2\pi}{2^{2m}(2m)!}\,f^{(2m)}(\xi), \tag{8.8.6}$$

where $|\xi| < 1$.

Since

$$T_m(x) = \cos\,(m\cos^{-1}x), \tag{8.8.7}$$

the abscissas are obtainable in the explicit form

$$x_i = \cos\left[\frac{(2i - 1)\pi}{2m}\right] \qquad (i = 1, 2, \ldots, m). \tag{8.8.8}$$

Also, direct calculation shows that

$$T_m'(x_i) = \frac{(-1)^{i+1}m}{\sin\alpha_i}, \qquad T_{m+1}(x_i) = (-1)^i\sin\alpha_i, \tag{8.8.9}$$

where

$$\alpha_i = \frac{(2i - 1)\pi}{2m}, \tag{8.8.10}$$

and hence (8.8.5) reduces to the remarkably simple form

$$H_i = \frac{\pi}{m}. \tag{8.8.11}$$

Thus the weights in (8.8.4) are all equal.

The formula (8.8.4) hence can be written in the explicit form

$$\int_{-1}^{1}\frac{f(x)}{\sqrt{1 - x^2}}\,dx = \frac{\pi}{m}\sum_{k=1}^{m} f\!\left(\cos\frac{(2k - 1)\pi}{2m}\right) + \frac{2\pi}{2^{2m}(2m)!}\,f^{(2m)}(\xi), \tag{8.8.12}$$

where $|\xi| < 1$.
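Because the abscissas and the common weight are explicit, the Chebyshev-Gauss formula is particularly easy to program. The following is a minimal sketch using only the Python standard library; the function name `chebyshev_gauss` is hypothetical, and the monomial test integrands are chosen here for illustration.

```python
import math

def chebyshev_gauss(f, m):
    """m-point Chebyshev-Gauss approximation to the integral of f(x)/sqrt(1-x^2) over (-1, 1)."""
    total = sum(f(math.cos((2 * k - 1) * math.pi / (2 * m))) for k in range(1, m + 1))
    return math.pi / m * total

# With m = 3 the rule is exact through degree 5: the integral of x^4/sqrt(1-x^2) is 3*pi/8.
cg_x4 = chebyshev_gauss(lambda x: x ** 4, 3)
# For x^6 the exact value is 5*pi/16 and the error term 2*pi*720/(2^6 * 6!) equals pi/32.
cg_x6_error = 5.0 * math.pi / 16.0 - chebyshev_gauss(lambda x: x ** 6, 3)
```

Since the sixth derivative of $x^6$ is constant, the error term is reproduced exactly in the second check.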


8.9. Jacobi-Gauss Quadrature. Most of the other Gaussian quadrature formulas which have been investigated in the literature correspond to the use of a specialization of the weighting function

$$w(x) = (1 - x)^{\alpha}(1 + x)^{\beta} \qquad (\alpha > -1,\ \beta > -1), \tag{8.9.1}$$

over the interval $(-1,1)$, or to the result of transforming this problem to the interval $(0,1)$. The special cases $\alpha = \beta = 0$ and $\alpha = \beta = -\tfrac{1}{2}$ have been considered in §§8.5 and 8.8.

In the general case, we may take $\pi(x)$ as the appropriate multiple of the polynomial

$$\phi_m(x) = C_m(1 - x)^{-\alpha}(1 + x)^{-\beta}\frac{d^m}{dx^m}\left[(1 - x)^{\alpha+m}(1 + x)^{\beta+m}\right] = (-2)^m C_m\,m!\sum_{k=0}^{m}\binom{m+\alpha+\beta+k}{k}\binom{m+\alpha}{m-k}\left(\frac{x - 1}{2}\right)^k \equiv C_m V_m(x), \tag{8.9.2}$$

which, as was noted in §7.8, reduces with a certain (not universally agreed upon) choice of $C_m$ to the $m$th Jacobi polynomial.†

The coefficient of $x^m$ is found to be

$$A_m = (-1)^m C_m\frac{\Gamma(2m+\alpha+\beta+1)}{\Gamma(m+\alpha+\beta+1)}, \tag{8.9.3}$$

whereas the normalizing factor is obtained, from (7.4.13), in the form

$$\gamma_m = C_m^2\,m!\,\frac{\Gamma(2m+\alpha+\beta+1)}{\Gamma(m+\alpha+\beta+1)}\int_{-1}^{1}(1 - x)^{\alpha+m}(1 + x)^{\beta+m}\,dx$$

$$= C_m^2\,m!\,\frac{\Gamma(2m+\alpha+\beta+1)}{\Gamma(m+\alpha+\beta+1)}\,\frac{\Gamma(m+\alpha+1)\,\Gamma(m+\beta+1)}{\Gamma(2m+\alpha+\beta+2)}\,2^{2m+\alpha+\beta+1}$$

$$= C_m^2\,\frac{2^{2m+\alpha+\beta+1}\,m!}{2m+\alpha+\beta+1}\,\frac{\Gamma(m+\alpha+1)\,\Gamma(m+\beta+1)}{\Gamma(m+\alpha+\beta+1)}, \tag{8.9.4}$$

where use was made of the formula

$$\int_{-1}^{1}(1 - x)^{p}(1 + x)^{q}\,dx = 2^{p+q+1}\,\frac{\Gamma(p+1)\,\Gamma(q+1)}{\Gamma(p+q+2)}. \tag{8.9.5}$$

The results (8.9.2) to (8.9.4) reduce to (8.5.1) to (8.5.3) when $\alpha = \beta = 0$ with the choice $C_m = (-1)^m/(2^m m!)$, corresponding to $\phi_m(x) = P_m(x)$, and to (8.8.2) and (8.8.3) when $\alpha = \beta = -\tfrac{1}{2}$ with the special choice $C_m = (-2)^m m!/(2m)!$, corresponding to $\phi_m(x) = T_m(x)$.

Thus we obtain the quadrature formula

$$\int_{-1}^{1}(1 - x)^{\alpha}(1 + x)^{\beta}f(x)\,dx = \sum_{k=1}^{m} H_k f(x_k) + E, \tag{8.9.6}$$

where $x_i$ is the $i$th zero of $V_m(x)$, and where, from (8.4.22) or (8.4.23) and (8.4.10),

$$H_i = \frac{2m+\alpha+\beta+2}{m+\alpha+\beta+1}\,\frac{\Gamma(m+\alpha+1)\,\Gamma(m+\beta+1)}{\Gamma(m+\alpha+\beta+1)}\,\frac{2^{2m+\alpha+\beta+1}\,m!}{V_m'(x_i)\,V_{m+1}(x_i)} \tag{8.9.7}$$

and

$$E = \frac{\Gamma(m+\alpha+1)\,\Gamma(m+\beta+1)\,\Gamma(m+\alpha+\beta+1)}{(2m+\alpha+\beta+1)\,[\Gamma(2m+\alpha+\beta+1)]^2}\,\frac{2^{2m+\alpha+\beta+1}\,m!}{(2m)!}\,f^{(2m)}(\xi), \tag{8.9.8}$$

where $|\xi| < 1$. Integration based on (8.9.6) is often known as Mehler quadrature.

† See Szegö [220]. The choice made in that reference is $C_m = (-1)^m/(2^m m!)$.


It is possible to establish the relation (see Szegö [220])

$$(2m+\alpha+\beta+2)(1 - x^2)V_m'(x) = (m+\alpha+\beta+1)\left[(2m+\alpha+\beta+2)x + (\alpha - \beta)\right]V_m(x) + (m+\alpha+\beta+1)V_{m+1}(x), \tag{8.9.9}$$

from which there follows

$$\frac{(1 - x_i^2)\,V_m'(x_i)}{V_{m+1}(x_i)} = \frac{m+\alpha+\beta+1}{2m+\alpha+\beta+2},$$

so that (8.9.7) can also be written in the somewhat simpler form

$$H_i = \frac{\Gamma(m+\alpha+1)\,\Gamma(m+\beta+1)}{\Gamma(m+\alpha+\beta+1)}\,\frac{2^{2m+\alpha+\beta+1}\,m!}{(1 - x_i^2)\,[V_m'(x_i)]^2}. \tag{8.9.10}$$

As an example, we consider the weighting function

$$w(x) = 1 - x^2, \tag{8.9.11}$$


in which case (8.9.2) gives

$$V_m(x) = (1 - x^2)^{-1}\frac{d^m}{dx^m}(1 - x^2)^{m+1}.$$

By making use of the relationship

$$\frac{d^{r-k}}{dx^{r-k}}(x^2 - 1)^r = 2^r r!\,\frac{(r-k)!}{(r+k)!}\,(x^2 - 1)^k\,\frac{d^kP_r(x)}{dx^k}, \tag{8.9.12}$$

this result can be written in the form

$$V_m(x) = (-1)^m\,\frac{2^{m+1}\,m!}{m+2}\,\frac{dP_{m+1}(x)}{dx}.$$

Hence there follows

$$\int_{-1}^{1}(1 - x^2)f(x)\,dx = \sum_{k=1}^{m} H_k f(x_k) + E, \tag{8.9.13}$$

where $x_i$ is the $i$th zero of $P_{m+1}'(x)$, and where, from (8.9.10) and (8.9.8),

$$H_i = \frac{2(m+1)(m+2)}{(1 - x_i^2)\,[P_{m+1}''(x_i)]^2}, \tag{8.9.14}$$

$$E = \frac{2^{2m+3}\,m!\,(m+2)!\,[(m+1)!]^2}{(2m+3)\,(2m)!\,[(2m+2)!]^2}\,f^{(2m)}(\xi). \tag{8.9.15}$$

Since $P_{m+1}(x)$ satisfies the differential equation

$$(1 - x^2)P_{m+1}''(x) - 2xP_{m+1}'(x) + (m+1)(m+2)P_{m+1}(x) = 0,$$

there follows

$$(1 - x_i^2)P_{m+1}''(x_i) = -(m+1)(m+2)P_{m+1}(x_i)$$

when $P_{m+1}'(x_i) = 0$, so that (8.9.14) can also be expressed in the form

$$H_i = \frac{2(1 - x_i^2)}{(m+1)(m+2)\,[P_{m+1}(x_i)]^2}. \tag{8.9.16}$$
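The formula for the weighting function $1 - x^2$ can be assembled directly from Legendre polynomials: the abscissas are the zeros of $P_{m+1}'(x)$ and the weights are $2(1 - x_i^2)/\{(m+1)(m+2)[P_{m+1}(x_i)]^2\}$. The following sketch assumes NumPy; the function name `gauss_one_minus_x2` is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def gauss_one_minus_x2(m):
    """Abscissas and weights of the m-point Gaussian rule for weight 1 - x^2 on (-1, 1)."""
    P_next = Legendre.basis(m + 1)
    # The abscissas are the m zeros of the derivative of P_{m+1}.
    x = np.sort(np.real(P_next.deriv().roots()))
    w = 2.0 * (1.0 - x ** 2) / ((m + 1) * (m + 2) * P_next(x) ** 2)
    return x, w

x2, w2 = gauss_one_minus_x2(2)
x3, w3 = gauss_one_minus_x2(3)
# Exactness check through degree 2m - 1 = 5: integral of (1 - x^2) x^4 is 2/5 - 2/7 = 4/35.
moment4 = float(np.dot(w3, x3 ** 4))
```

For $m = 2$ the abscissas are $\pm 1/\sqrt{5}$ and both weights equal $2/3$, which sum to $\int_{-1}^{1}(1 - x^2)\,dx = 4/3$.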





By making use of the Stirling approximation (3.7.5) to the factorial, and of the fact that $\Gamma(m + k + 1) = (m + k)! \sim m^k\,m!$ as $m \to \infty$, when $k$ is fixed, we find from (8.9.8) that

$$E \sim \frac{\pi}{2^{2m+\alpha+\beta}}\,\frac{f^{(2m)}(\xi)}{(2m)!} \tag{8.9.17}$$

when $m$ is large, in the general case of (8.9.6). This result holds, in particular, in the special cases $\alpha = \beta = 0$ (Legendre-Gauss), $\alpha = \beta = -\tfrac{1}{2}$ (Chebyshev-Gauss), and $\alpha = \beta = 1$ (8.9.13).

Thus, if $f(x)$ possesses singularities in the finite part of the complex plane, and if the singularity nearest $\xi$ is at a distance $R_\xi$ from $\xi$, so that $f^{(2m)}(\xi)$ is generally of the order of magnitude of $(2m)!/R_\xi^{2m}$ when $m$ is large, the magnitude of the error is generally of the order $\pi/(2R_\xi)^{2m}$. Accordingly, if there exists a singularity at distance less than $\tfrac{1}{2}$ from the segment $(-1,1)$ of the real axis of a complex $x$ plane, the sequence of approximations afforded by increasing values of $m$ may be slowly convergent, or divergent, as in the case of the Newton-Cotes formulas (§3.7). In the Hermite-Gauss quadrature, the error in this case is of the order $\sqrt{\pi}\,m!/(\sqrt{2}\,R_\xi)^{2m}$, while it is of the order $(m!)^2/R_\xi^{2m}$ in the Laguerre-Gauss case. Whereas such errors would increase rapidly with increasing $m$ if $R_\xi$ remained bounded, in both these situations the permissible range of $\xi$, and hence of $R_\xi$, is infinite, so that the relevant value of $R_\xi$ may itself increase rapidly in magnitude as $m$ increases, as was noted in §8.7.

8.10. Formulas with Assigned Abscissas. In some applications it is desirable to prescribe one or more of the $m$ abscissas to be involved in a quadrature formula. In particular, whereas none of the true Gaussian formulas involves the values of $f(x)$ at the ends of the interval, it is sometimes important that one or both of these end values be used. It may be expected that, for each arbitrarily prescribed abscissa, the degree of precision generally will be reduced by unity below the maximum value $2m - 1$. In particular, if all abscissas were prescribed at random, the maximum degree of precision would generally be reduced to $m - 1$. Clearly, exceptions occur when the abscissas are preassigned in special ways.
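The effect of prescribing abscissas on the degree of precision can be explored numerically. The following sketch, assuming NumPy and a unit weighting function on $(-1,1)$, determines the weights of the interpolatory rule for any given abscissas by solving the moment equations, and then finds the highest power of $x$ integrated exactly; the function names and the sample abscissas are hypothetical choices made here for illustration.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def moment(j):
    """Integral of x^j over (-1, 1)."""
    return (1 - (-1) ** (j + 1)) / (j + 1)

def interpolatory_weights(nodes):
    """Weights making the rule exact for 1, x, ..., x^(m-1)."""
    nodes = np.asarray(nodes, dtype=float)
    m = len(nodes)
    V = np.vander(nodes, m, increasing=True).T   # row j holds the values x_k^j
    return np.linalg.solve(V, np.array([moment(j) for j in range(m)]))

def degree_of_precision(nodes, weights, tol=1e-10, max_degree=20):
    """Largest d such that all powers up to x^d are integrated within tol."""
    nodes = np.asarray(nodes, dtype=float)
    d = -1
    for j in range(max_degree + 1):
        if abs(np.dot(weights, nodes ** j) - moment(j)) > tol:
            break
        d = j
    return d

arbitrary = [-0.9, -0.2, 0.5]                    # three abscissas chosen without design
gauss_nodes, _ = leggauss(3)                     # three Legendre-Gauss abscissas
equal_spaced = [-1.0, 0.0, 1.0]                  # a specially preassigned set
dp_arbitrary = degree_of_precision(arbitrary, interpolatory_weights(arbitrary))
dp_gauss = degree_of_precision(gauss_nodes, interpolatory_weights(gauss_nodes))
dp_equal = degree_of_precision(equal_spaced, interpolatory_weights(equal_spaced))
```

With $m = 3$, the arbitrary abscissas give degree $m - 1 = 2$, the Gaussian abscissas give $2m - 1 = 5$, and the equally spaced set (Simpson's rule) is one of the exceptions, with degree 3.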

Whereas the Gaussian formulas were derived in §8.4 from the Hermite formulas, by requiring that the $m$ weights $\bar{H}_i$ vanish, a somewhat different approach (which could also have been used in the Gaussian case) is desirable here.


We recall first that the Lagrangian quadrature formula

$$\int_a^b w(x)f(x)\,dx = \sum_{k=1}^{m} W_k f(x_k) + E, \tag{8.10.1}$$

where

$$\pi(x) = (x - x_1)(x - x_2)\cdots(x - x_m) \tag{8.10.2}$$

and

$$W_i = \int_a^b w(x)\,l_i(x)\,dx = \frac{1}{\pi'(x_i)}\int_a^b w(x)\,\frac{\pi(x)}{x - x_i}\,dx, \tag{8.10.3}$$

always has a degree of precision of at least $m - 1$. Now any function $f(x)$ can be expressed as the sum

$$f(x) = p_{m-1}(x) + \pi(x)\,f[x_1,x_2,\ldots,x_m,x], \tag{8.10.4}$$

where $p_{m-1}(x)$ is the polynomial, of degree $m - 1$ or less, agreeing with $f(x)$ at the $m$ points $x_1, \ldots, x_m$, and where $f[x_1,\ldots,x_m,x]$ is the $m$th divided difference of $f(x)$, relative to $x_1, \ldots, x_m$, defined in Chap. 2. Hence $f(x)$ can be replaced by that sum in (8.10.1). But since the error resulting from replacing $f(x)$ by $p_{m-1}(x)$ will vanish, and since $\pi(x_i) = 0$, we thus obtain the expression

$$E = \int_a^b w(x)\,\pi(x)\,f[x_1,\ldots,x_m,x]\,dx \tag{8.10.5}$$

for the error $E$ in (8.10.1).†

† This result is equivalent to (5.11.34). The derivation is repeated here, for completeness, in the modified notation of the present chapter.

If $f(x)$ is a polynomial of degree $m + r$, its divided difference of order $m$ is a polynomial of degree $r$, and conversely. Hence we deduce from (8.10.5) that the quadrature formula (8.10.1) has a degree of precision of at least $m + r - 1$ if and only if the polynomial $\pi(x)$, whose $m$ zeros are the abscissas, is orthogonal, relative to $w(x)$, to all polynomials of degree less than $r$.

When $r = m$, this result reduces to the result of §8.4 and serves to specify the Gaussian quadrature formulas, which were also derivable as special Hermite formulas for which $\bar{H}_i = 0$, and for which also $H_i = W_i$.

Now suppose that $m - r$ of the $m$ abscissas are preassigned, leaving the $r$ "free" abscissas $x_1, x_2, \ldots, x_r$ to be determined so that the degree of precision will be maximized. If we write

$$\pi(x) = [(x - x_1)\cdots(x - x_r)][(x - x_{r+1})\cdots(x - x_m)] \equiv \bar{\pi}(x)\,v(x), \tag{8.10.6}$$

where

$$\bar{\pi}(x) = (x - x_1)(x - x_2)\cdots(x - x_r) \tag{8.10.7}$$

is a polynomial of degree $r$ whose $r$ zeros are the free abscissas, which are to be determined, and where

$$v(x) = (x - x_{r+1})(x - x_{r+2})\cdots(x - x_m) \tag{8.10.8}$$

is a known polynomial, of degree $m - r$, whose zeros are the preassigned abscissas, the condition

$$\int_a^b w(x)\,\pi(x)\,u_{r-1}(x)\,dx = 0, \tag{8.10.9}$$

where $u_{r-1}(x)$ is an arbitrary polynomial of degree $r - 1$ or less, takes the form

$$\int_a^b [w(x)v(x)]\,\bar{\pi}(x)\,u_{r-1}(x)\,dx = 0. \tag{8.10.10}$$

Thus we may consider $\bar{\pi}(x)$ as the appropriate multiple of the $r$th member of a set of polynomials $\phi_0(x), \phi_1(x), \ldots, \phi_r(x), \ldots$, of degrees $0, 1, \ldots, r, \ldots$, respectively, which are orthogonal over $(a,b)$ relative to the modified weighting function

$$\bar{w}(x) = w(x)\,v(x), \tag{8.10.11}$$

and the methods of §7.4 are again available for its determination. However, if $v(x)$ changes sign in $(a,b)$, the modified weighting function $\bar{w}(x)$ will have the same property. Thus there is then no assurance that the zeros of $\phi_r(x)$ will be real or, if so, that they will lie inside $(a,b)$.

In the important cases for which only one or both of the end points $x = a$ and $x = b$ are taken as preassigned abscissas, so that $v(x)$ is given by $x - a$, $x - b$, or $(x - a)(x - b)$, this difficulty does not arise since then $v(x)$ is of fixed sign in $(a,b)$. Attention will be restricted to these cases in what follows.

In order to evaluate the weights $W_i$, we write $\bar{\pi}(x) = \phi_r(x)/A_r$, to take into account the fact that the polynomial $\phi_r(x)$ which is most conveniently employed may not have unity as its highest coefficient, and notice that then

$$\pi(x) = \frac{1}{A_r}\,v(x)\,\phi_r(x),$$

where $v(x)$ is defined by (8.10.8). Equation (8.10.3) then becomes

$$W_i = \frac{1}{\{[v(x)\phi_r(x)]'\}_{x = x_i}}\int_a^b w(x)\,\frac{v(x)\,\phi_r(x)}{x - x_i}\,dx, \tag{8.10.12}$$

and is, of course, independent of $A_r$.

For $i = 1, 2, \ldots, r$ the abscissa $x_i$ is a zero of $\phi_r(x)$. Hence there follows

$$W_i = \frac{1}{v(x_i)\,\phi_r'(x_i)}\int_a^b w(x)\,v(x)\,\frac{\phi_r(x)}{x - x_i}\,dx, \tag{8.10.13}$$

for $i = 1, 2, \ldots, r$, and a comparison of this form with (8.4.22), with $m$ replaced by $r$ and $w(x)$ by $\bar{w}(x)$, leads to the desired result

$$W_i = -\frac{A_{r+1}\,\bar{\gamma}_r}{A_r\,v(x_i)\,\phi_r'(x_i)\,\phi_{r+1}(x_i)} \qquad (i = 1, 2, \ldots, r), \tag{8.10.14}$$

where $A_r$ is the coefficient of $x^r$ in $\phi_r(x)$, and where

$$\bar{\gamma}_r = \int_a^b \bar{w}(x)[\phi_r(x)]^2\,dx = \int_a^b w(x)\,v(x)[\phi_r(x)]^2\,dx. \tag{8.10.15}$$

Equation (8.10.14) determines all weights except those corresponding to the preassigned abscissas.


In the case when only the abscissa $x = a$ is preassigned, so that $v(x) = x - a$, the corresponding weight is expressed by (8.10.12) in the form

$$W = \frac{1}{\phi_r(a)}\int_a^b w(x)\,\phi_r(x)\,dx \qquad (x_i = a), \tag{8.10.16}$$

whereas when only $x = b$ is fixed, so that $v(x) = x - b$, there follows

$$W = \frac{1}{\phi_r(b)}\int_a^b w(x)\,\phi_r(x)\,dx \qquad (x_i = b). \tag{8.10.17}$$

In the case when both $x = a$ and $x = b$ are fixed, so that

$$v(x) = (x - a)(x - b),$$

there follows

$$W = \frac{1}{(b - a)\,\phi_r(a)}\int_a^b (b - x)\,w(x)\,\phi_r(x)\,dx \qquad (x_i = a) \tag{8.10.18}$$

and

$$W = \frac{1}{(b - a)\,\phi_r(b)}\int_a^b (x - a)\,w(x)\,\phi_r(x)\,dx \qquad (x_i = b). \tag{8.10.19}$$

Alternatively, the weights corresponding to the prescribed end ordinate or ordinates can be determined in terms of the remaining weights by use of one or both of the relations

$$\sum_{k=1}^{m} W_k = \int_a^b w(x)\,dx, \qquad \sum_{k=1}^{m} x_kW_k = \int_a^b x\,w(x)\,dx, \tag{8.10.20}$$

which require that the error in (8.10.1) vanish when $f(x) = 1$ and when $f(x) = x$, respectively.

The special cases in which $w(x)$ is constant are treated in the two following sections.

In the general case, it is possible to show that the relevant quadrature formula can be obtained by replacing $f(x)$ in the integrand by the polynomial of degree $m + r - 1$ which agrees with $f(x)$ when $x = x_1, \ldots, x_m$ and whose derivative agrees with $f'(x)$ at the unassigned points $x = x_1, \ldots, x_r$. Thus the error can be expressed in the form

$$E = \int_a^b w(x)\,[(x - x_1)\cdots(x - x_r)]^2\,(x - x_{r+1})\cdots(x - x_m)\;f[x_1,x_1,\ldots,x_r,x_r,x_{r+1},\ldots,x_m,x]\,dx. \tag{8.10.21}$$

In particular, if $w(x) \ge 0$ in $(a,b)$, if no assigned abscissas lie inside $(a,b)$, and if $f^{(m+r)}(x)$ is continuous, there follows also

$$E = \frac{f^{(m+r)}(\xi)}{(m+r)!}\int_a^b w(x)\,[(x - x_1)\cdots(x - x_r)]^2\,(x - x_{r+1})\cdots(x - x_m)\,dx = \frac{f^{(m+r)}(\xi)}{(m+r)!}\int_a^b \bar{w}(x)[\bar{\pi}(x)]^2\,dx = \frac{\bar{\gamma}_r}{A_r^2}\,\frac{f^{(m+r)}(\xi)}{(m+r)!}, \tag{8.10.22}$$

where $\xi$ lies between the largest and smallest of $x_1, \ldots, x_m$, $a$, and $b$.
8.11. Radau Quadrature. In the case of a finite interval, with a unit weighting function, when one end of the interval is assigned as an abscissa, it is again convenient to suppose that the interval has been transformed to $(-1,1)$, with $x = -1$ as the fixed abscissa, by an appropriate change in variables. We then have

$$v(x) = x + 1 \tag{8.11.1}$$

and

$$\pi(x) = (x + 1)\,\bar{\pi}(x), \tag{8.11.2}$$

where $\bar{\pi}(x)$ is a multiple of the $r$th member of a set of orthogonal polynomials $\phi_0(x), \phi_1(x), \ldots, \phi_r(x), \ldots$, which has the property

$$\int_{-1}^{1}(x + 1)\,\phi_r(x)\,u_{r-1}(x)\,dx = 0, \tag{8.11.3}$$

where $u_{r-1}(x)$ is an arbitrary polynomial of degree $r - 1$ or less.

If we follow the procedure of §7.4, by writing

$$(x + 1)\,\phi_r(x) = \frac{d^rU_r(x)}{dx^r}$$

and integrating the left-hand member of (8.11.3) by parts $r$ times, we find that $U_r(x)$ must satisfy the equation

$$\frac{d^{r+1}}{dx^{r+1}}\left[\frac{1}{x + 1}\,\frac{d^rU_r}{dx^r}\right] = 0$$

and the requirements that $U_r, U_r', \ldots, U_r^{(r-1)}$ vanish when $x = \pm 1$, and hence that $U_r$ must be of the form

$$U_r = C_r(x + 1)(x^2 - 1)^r.$$

Thus it follows finally that

$$\phi_r(x) = \frac{C_r}{x + 1}\,\frac{d^r}{dx^r}\left[(x + 1)(x^2 - 1)^r\right], \tag{8.11.4}$$

which can be expressed in the form

$$\phi_r(x) = \frac{C_r}{x + 1}\left[(x + 1)\frac{d^r}{dx^r}(x^2 - 1)^r + r\,\frac{d^{r-1}}{dx^{r-1}}(x^2 - 1)^r\right]$$

or, by making use of the relationship (8.9.12), in the form

$$\phi_r(x) = 2^r r!\,C_r\left[P_r(x) + \frac{x - 1}{r + 1}\,P_r'(x)\right]. \tag{8.11.5}$$
It is convenient to take

$$C_r = \frac{1}{2^r r!}. \tag{8.11.6}$$

Then, noticing that here $r = m - 1$, since only one abscissa is preassigned, we conclude that the $m - 1$ free abscissas are the zeros of

$$\phi_{m-1}(x) = \frac{1}{2^{m-1}(m-1)!}\,\frac{1}{x + 1}\,\frac{d^{m-1}}{dx^{m-1}}\left[(x + 1)^m(x - 1)^{m-1}\right] = P_{m-1}(x) + \frac{x - 1}{m}\,P_{m-1}'(x) = \frac{P_{m-1}(x) + P_m(x)}{1 + x}, \tag{8.11.7}$$

where the last form follows from the recurrence formula (8.5.7). The leading coefficient is found to be

$$A_{m-1} = \frac{(2m - 1)!}{2^{m-1}\,m\,[(m-1)!]^2}. \tag{8.11.8}$$

With this result, we notice next that

$$\bar{\gamma}_{m-1} = \int_{-1}^{1}(1 + x)[\phi_{m-1}(x)]^2\,dx = A_{m-1}\int_{-1}^{1} x^{m-1}(1 + x)\,\phi_{m-1}(x)\,dx = \frac{A_{m-1}}{2^{m-1}(m-1)!}\int_{-1}^{1} x^{m-1}\,\frac{d^{m-1}}{dx^{m-1}}\left[(1 + x)(x^2 - 1)^{m-1}\right]dx,$$

and an $(m - 1)$-fold integration by parts, followed by the use of (8.9.5), leads to the simple result

$$\bar{\gamma}_{m-1} = \frac{2}{m}. \tag{8.11.9}$$
Thus, by introducing (8.11.8) and (8.11.9) into (8.10.14), we obtain the weights

$$W_i = -\frac{2(2m + 1)}{m(m + 1)}\,\frac{1}{(1 + x_i)\,\phi_{m-1}'(x_i)\,\phi_m(x_i)} \qquad (x_i \ne -1), \tag{8.11.10}$$

corresponding to the $m - 1$ free abscissas. By making appropriate use of the formula (8.5.7), together with the fact that $\phi_{m-1}(x_i) = 0$ implies

$$P_{m-1}(x_i) = \frac{1 - x_i}{m}\,P_{m-1}'(x_i) = -P_m(x_i),$$

we find, after some manipulation, that

$$\phi_{m-1}'(x_i) = \frac{2}{1 + x_i}\,P_{m-1}'(x_i), \qquad \phi_m(x_i) = -\frac{2m + 1}{m + 1}\,P_{m-1}(x_i),$$

so that (8.11.10) reduces to

$$W_i = \frac{1}{m^2}\,\frac{1 - x_i}{[P_{m-1}(x_i)]^2} = \frac{1}{1 - x_i}\,\frac{1}{[P_{m-1}'(x_i)]^2} \qquad (x_i \ne -1). \tag{8.11.11}$$


The weight corresponding to the abscissa $x = -1$ follows from (8.10.16) in the form

$$W = \frac{1}{\phi_{m-1}(-1)}\int_{-1}^{1}\phi_{m-1}(x)\,dx \qquad (x_i = -1). \tag{8.11.12}$$

We obtain first

$$\int_{-1}^{1}\phi_{m-1}(x)\,dx = \int_{-1}^{1}P_{m-1}(x)\,dx + \frac{1}{m}\int_{-1}^{1}(x - 1)\,P_{m-1}'(x)\,dx$$

or, after integrating the second member by parts and noticing that the first member integrates to zero (when $m > 1$),

$$\int_{-1}^{1}\phi_{m-1}(x)\,dx = \frac{2}{m}\,P_{m-1}(-1) = \frac{(-1)^{m-1}\,2}{m},$$

since $P_{m-1}(-1) = (-1)^{m-1}$. By making use of the additional fact that $P_{m-1}'(-1) = (-1)^m m(m - 1)/2$, we obtain also

$$\phi_{m-1}(-1) = (-1)^{m-1}m,$$

and hence (8.11.12) becomes

$$W = \frac{2}{m^2} \qquad (x_i = -1).$$

Thus, in summary, we have obtained the quadrature formula

$$\int_{-1}^{1} f(x)\,dx = \frac{2}{m^2}\,f(-1) + \sum_{k=1}^{m-1} W_k f(x_k) + E, \tag{8.11.15}$$

where $x_i$ is the $i$th zero of the polynomial (8.11.7), and where the weights are defined by (8.11.10) and are positive. This formula is one of several attributed to Radau.
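The Radau formula can be assembled from Legendre polynomials alone: the free abscissas are the zeros of $P_{m-1}(x) + P_m(x)$ other than $x = -1$, the free weights are $(1 - x_i)/\{m^2[P_{m-1}(x_i)]^2\}$, and the weight at $x = -1$ is $2/m^2$. The following sketch assumes NumPy; the function name `radau` is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def radau(m):
    """Abscissas and weights of the m-point Radau rule on (-1, 1) with x = -1 preassigned."""
    P_prev, P_m = Legendre.basis(m - 1), Legendre.basis(m)
    # P_{m-1} + P_m vanishes at x = -1 and at the m - 1 free abscissas.
    roots = np.sort(np.real((P_prev + P_m).roots()))
    free = roots[1:]                              # discard the root at x = -1
    w_free = (1.0 - free) / (m ** 2 * P_prev(free) ** 2)
    nodes = np.concatenate(([-1.0], free))
    weights = np.concatenate(([2.0 / m ** 2], w_free))
    return nodes, weights

nodes3, weights3 = radau(3)
# Degree of precision 2m - 2 = 4: the integral of x^4 over (-1, 1) is 2/5.
moment4 = float(np.dot(weights3, nodes3 ** 4))
```

For $m = 3$ the sketch gives free abscissas near $-0.289898$ and $0.689898$, with weights near $1.024972$ and $0.752806$ and an end weight of $2/9$.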


In order to obtain an expression for the error $E$, we first specialize (8.10.5) to the form

$$E = \int_{-1}^{1}\pi(x)\,f[x_1,\ldots,x_m,x]\,dx, \tag{8.11.16}$$

appropriate to the present case. We have also

$$\pi(x) = (x + 1)\,\bar{\pi}(x) = \frac{x + 1}{A_{m-1}}\,\phi_{m-1}(x) = \frac{1}{A_{m-1}}\,\frac{d^{m-1}U}{dx^{m-1}}, \tag{8.11.17}$$

where

$$U(x) \equiv U_{m-1}(x) = \frac{1}{2^{m-1}(m-1)!}\,(x + 1)^m(x - 1)^{m-1}. \tag{8.11.18}$$

Thus (8.11.16) becomes

$$E = \frac{1}{A_{m-1}}\int_{-1}^{1}\frac{d^{m-1}U}{dx^{m-1}}\,f[x_1,\ldots,x_m,x]\,dx. \tag{8.11.19}$$

If we integrate by parts $m - 1$ times, making use of the fact that $U, U', \ldots, U^{(m-2)}$ all vanish when $x = \pm 1$, and of the fact that

$$\frac{d^{m-1}}{dx^{m-1}}\,f[x_1,\ldots,x_m,x] = \frac{(m-1)!}{(2m-1)!}\,f^{(2m-1)}(\eta), \tag{8.11.20}$$

in accordance with (5.11.36), where $|\eta| < 1$, there follows

$$E = \frac{(-1)^{m-1}(m-1)!}{(2m-1)!\,A_{m-1}}\int_{-1}^{1}U(x)\,f^{(2m-1)}(\eta)\,dx. \tag{8.11.21}$$

Now, since $U(x)$ does not change sign in $(-1,1)$, the second law of the mean can be applied, to give

$$E = \frac{(-1)^{m-1}(m-1)!\,f^{(2m-1)}(\xi)}{(2m-1)!\,A_{m-1}}\int_{-1}^{1}U(x)\,dx = \frac{(m-1)!}{(2m-1)!}\,\frac{2^{m-1}m[(m-1)!]^2}{(2m-1)!}\,\frac{f^{(2m-1)}(\xi)}{2^{m-1}(m-1)!}\int_{-1}^{1}(1 + x)^m(1 - x)^{m-1}\,dx$$

or finally, after using (8.9.5),

$$E = \frac{2^{2m-1}\,m\,[(m-1)!]^4}{[(2m-1)!]^3}\,f^{(2m-1)}(\xi) \qquad (|\xi| < 1), \tag{8.11.22}$$

in accordance with (8.10.22).

The first six of the polynomials (8.11.7) are found to be of the form

$$\phi_0(x) = 1, \qquad \phi_1(x) = \tfrac{1}{2}(3x - 1), \qquad \phi_2(x) = \tfrac{1}{2}(5x^2 - 2x - 1),$$
$$\phi_3(x) = \tfrac{1}{8}(35x^3 - 15x^2 - 15x + 3),$$
$$\phi_4(x) = \tfrac{1}{40}(315x^4 - 140x^3 - 210x^2 + 60x + 15),$$
$$\phi_5(x) = \tfrac{1}{16}(231x^5 - 105x^4 - 210x^3 + 70x^2 + 35x - 5), \tag{8.11.23}$$

and additional ones can be obtained from the recurrence formula

$$\phi_{r+1}(x) = \frac{1}{(r + 2)(2r + 1)}\left\{[(2r + 1)(2r + 3)x - 1]\,\phi_r(x) - r(2r + 3)\,\phi_{r-1}(x)\right\}, \tag{8.11.24}$$

or by reference to (8.11.7).

In the simplest nontrivial case, $m = 2$, there follows $x_1 = \tfrac{1}{3}$. The weight $W_1$ is found to be $\tfrac{3}{2}$, and the weight relative to $x = -1$ to be $\tfrac{1}{2}$. Thus, the best two-point formula with $x = -1$ preassigned is of the form

$$\int_{-1}^{1} f(x)\,dx = \tfrac{1}{2}f(-1) + \tfrac{3}{2}f(\tfrac{1}{3}) + \tfrac{2}{27}f'''(\xi) \qquad (|\xi| < 1). \tag{8.11.25}$$

By setting $x = 2t - 1$, and writing $f(2t - 1) = F(t)$, we may rewrite this formula in the form

$$\int_0^1 F(t)\,dt = \tfrac{1}{4}F(0) + \tfrac{3}{4}F(\tfrac{2}{3}) + \tfrac{1}{216}F'''(\eta) \qquad (0 < \eta < 1), \tag{8.11.26}$$

and similar forms can be obtained in the other cases.

The abscissas and weights corresponding to formulas for which $2 \le m \le 5$ are listed, to six digits, in Table 8.4.

TABLE 8.4

 m    Abscissas      Weights
 2    -1             1/2
       1/3           3/2
 3    -1             0.222222
      -0.289898      1.024972
       0.689898      0.752806
 4    -1             0.125000
      -0.575319      0.657689
       0.181066      0.776387
       0.822824      0.440925
 5    -1             0.080000
      -0.720480      0.446207
      -0.167181      0.623653
       0.446314      0.562712
       0.885792      0.287427

8.12. Lobatto Quadrature. In the case when both ends of the interval $(-1,1)$ are preassigned as abscissas, the weighting function being unity, the derivation is quite similar to that of the preceding section. Thus, with

$$v(x) = x^2 - 1, \tag{8.12.1}$$

it is found that

$$\phi_r(x) = \frac{C_r}{x^2 - 1}\,\frac{d^r}{dx^r}(x^2 - 1)^{r+1}, \tag{8.12.2}$$

and that, in accordance with (8.9.12), if we set

$$C_r = \frac{r + 2}{2^{r+1}\,r!}, \tag{8.12.3}$$

this result is of the form $\phi_r(x) = P_{r+1}'(x)$. Hence, since here $r = m - 2$, the free abscissas are the zeros of the polynomial

$$\phi_{m-2}(x) = P_{m-1}'(x) = \frac{m}{2^{m-1}(m-2)!}\,\frac{1}{x^2 - 1}\,\frac{d^{m-2}}{dx^{m-2}}(x^2 - 1)^{m-1}. \tag{8.12.4}$$

The additional results

$$A_{m-2} = \frac{(2m - 2)!}{2^{m-1}(m-1)!\,(m-2)!} \tag{8.12.5}$$

and

$$\bar{\gamma}_{m-2} = -\frac{2m(m - 1)}{2m - 1}, \tag{8.12.6}$$

the negative sign in (8.12.6) being a consequence of the fact that here $v(x)$ is negative in $(-1,1)$, are obtained by methods similar to those of the preceding section. Next the weights corresponding to the free abscissas are obtained in the form

$$W_i = -\frac{2m}{(1 - x_i^2)\,P_{m-1}''(x_i)\,P_m'(x_i)} \qquad (x_i \ne \pm 1),$$

which can be rewritten more conveniently as

$$W_i = \frac{2}{m(m - 1)[P_{m-1}(x_i)]^2} \qquad (x_i \ne \pm 1). \tag{8.12.7}$$

The weights corresponding to the fixed abscissas $x = \pm 1$ are found to be equal and to have the value

$$W = \frac{2}{m(m - 1)} \qquad (x_i = \pm 1), \tag{8.12.8}$$

which is the same as that given by the right-hand member of (8.12.7) when $x_i = \pm 1$.

The corresponding quadrature formula,

$$\int_{-1}^{1} f(x)\,dx = \frac{2}{m(m - 1)}\,[f(1) + f(-1)] + \sum_{k=1}^{m-2} W_k f(x_k) + E, \tag{8.12.9}$$

where $x_i$ is the $i$th zero of $P_{m-1}'(x)$, and $W_i$ is given by (8.12.7) and is positive, is known as Lobatto's quadrature formula.
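Lobatto's formula is equally simple to assemble: the interior abscissas are the zeros of $P_{m-1}'(x)$, and a single expression $2/\{m(m-1)[P_{m-1}(x_i)]^2\}$ serves for all weights, since $[P_{m-1}(\pm 1)]^2 = 1$. The following sketch assumes NumPy and $m \ge 3$; the function name `lobatto` is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre

def lobatto(m):
    """Abscissas and weights of the m-point Lobatto rule on (-1, 1), for m >= 3."""
    P = Legendre.basis(m - 1)
    interior = np.sort(np.real(P.deriv().roots()))   # zeros of P'_{m-1}
    nodes = np.concatenate(([-1.0], interior, [1.0]))
    # The same expression gives the end weights because P_{m-1}(+-1)^2 = 1.
    weights = 2.0 / (m * (m - 1) * P(nodes) ** 2)
    return nodes, weights

nodes3, weights3 = lobatto(3)
nodes4, weights4 = lobatto(4)
nodes5, weights5 = lobatto(5)
# Degree of precision 2m - 3 = 7 for m = 5: the integral of x^6 over (-1, 1) is 2/7.
moment6 = float(np.dot(weights5, nodes5 ** 6))
```

For $m = 3$ the result is Simpson's rule, with weights $1/3$, $4/3$, $1/3$; for $m = 4$ the interior abscissas are $\pm 1/\sqrt{5}$ with weight $5/6$.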
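In order to obtain an expression for the error $E$, we may notice that here

$$\pi(x) = \frac{1}{A_{m-2}}\,\frac{d^{m-2}U}{dx^{m-2}}, \qquad \text{where}\quad U = \frac{m}{2^{m-1}(m-2)!}\,(x^2 - 1)^{m-1}, \tag{8.12.10}$$

and hence, from (8.10.5), we have

$$E = \frac{1}{A_{m-2}}\int_{-1}^{1}\frac{d^{m-2}U}{dx^{m-2}}\,f[x_1,\ldots,x_m,x]\,dx = \frac{(-1)^{m-2}(m-2)!}{(2m-2)!\,A_{m-2}}\int_{-1}^{1}U(x)\,f^{(2m-2)}(\eta)\,dx = \frac{(-1)^{m-2}(m-2)!\,f^{(2m-2)}(\xi)}{(2m-2)!\,A_{m-2}}\int_{-1}^{1}U(x)\,dx$$

or, finally,

$$E = -\frac{m(m - 1)^3\,2^{2m-1}\,[(m-2)!]^4}{(2m - 1)\,[(2m-2)!]^3}\,f^{(2m-2)}(\xi) \qquad (|\xi| < 1), \tag{8.12.11}$$

in accordance with (8.10.22), by arguments completely analogous to those used in deducing (8.11.22).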
In the simplest nontrivial case, $m = 3$, the free abscissa is found from the equation $P_2'(x) = 3x = 0$ to be $x = 0$, as would be expected from the symmetry. The corresponding weight is found to be $\tfrac{4}{3}$, whereas the weights corresponding to $x = \pm 1$ are each $\tfrac{1}{3}$. Hence, as also might have been anticipated, the Lobatto formula reduces in this simple case to Simpson's rule.
The abscissas and weights corresponding to formulas for which $3 \le m \le 6$ are listed, to six digits, in Table 8.5. More elaborate tabulations are listed in the references.

When the Lobatto formula is applied to a function $f(x)$ which vanishes at both ends of the interval of integration, so that only $r = m - 2$ ordinates are actually involved in the calculation, the degree of precision is $2m - 3 = 2r + 1$. Similarly, when the Radau formula is applied to an integrand which vanishes at the lower limit, so that $r = m - 1$ ordinates are used, the degree of precision is $2m - 2 = 2r$. Thus, in such cases, a higher effective degree of precision is attained than that afforded by the formulas of Gaussian type, in which the use of $r$ ordinates leads to a degree of precision of $2r - 1$.

TABLE 8.5

 m    Abscissas      Weights
 3     0             4/3
      ±1             1/3
 4    ±0.447214      5/6
      ±1             1/6
 5     0             32/45
      ±0.654654      49/90
      ±1             1/10
 6    ±0.285232      0.554858
      ±0.765055      0.378475
      ±1             0.066667

8.13. Chebyshev Quadrature. By imposing various restrictions on
the abscissas and/or weights in a formula of the type (8.10.1), various
classes of quadrature formulas may be obtained in addition to those so
far considered. In this connection, it may be noticed that, if the abscissas
are required to be equally spaced, the Newton-Cotes formulas of Chap. 3
are obtained when $w(x) = 1$. In this case, $m$ abscissas are fixed and the
degree of precision may be expected to be reduced from $2m - 1$ to $m - 1$.
However, as was seen in Chap. 3, when $m$ is odd, so that the mid-point of
the interval is one of the abscissas, the degree of precision is increased to
$m$.†

Another important class of formulas, associated with the name of 
Chebyshev, is that in which all the weights are made equal. This situa- 
tion is desirable, not only for convenience, but also in order that the effects 
of errors in the ordinates will be minimized. Here the common weight 
and the m abscissas are “free,” and it may be expected that a formula 
with a degree of precision of at least m may be determinable. However, 
this expectation is not always to be realized. 

We suppose again that the original interval has been transformed into
$(-1,1)$, so that the desired formula is of the form

$$ \int_{-1}^{1} w(x) f(x)\,dx = W \sum_{k=1}^{m} f(x_k) + E[f(x)], \tag{8.13.1} $$

where $W$ is the common weight. It may be noticed first that the weight
$W$ cannot be assigned, if the degree of precision is to be positive, but is
determined by the requirement that $E = 0$ when $f(x) = 1$, in the form

$$ W = \frac{\lambda}{m} \qquad \text{where} \qquad \lambda = \int_{-1}^{1} w(x)\,dx. \tag{8.13.2} $$

† It should be recalled that $m$ here corresponds to $n + 1$ in Chap. 3.


Now we assume that a set of $m$ abscissas $x_k$ exists in $(-1,1)$ such that the
degree of precision is indeed at least $m$, and, as before, we write

$$ \pi(x) = (x - x_1)(x - x_2) \cdots (x - x_m). \tag{8.13.3} $$

Then, following the derivation of Chebyshev, we identify $f(x)$ in particular
with the special function

$$ f(x) = \frac{1}{u - x} \qquad (u > 1), \tag{8.13.4} $$

in which case (8.13.1) becomes

$$ \int_{-1}^{1} \frac{w(x)}{u - x}\,dx = \frac{\lambda}{m} \sum_{k=1}^{m} \frac{1}{u - x_k} + E\!\left[\frac{1}{u - x}\right]. \tag{8.13.5} $$


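The reason for choosing the special function $1/(u - x)$ is now seen if
we notice that, since

$$ \log \pi(u) = \sum_{k=1}^{m} \log (u - x_k), $$

the finite sum in (8.13.5) can be expressed as

$$ \frac{d}{du} \log \pi(u), $$

and hence that equation becomes

$$ \int_{-1}^{1} \frac{w(x)}{u - x}\,dx = \frac{\lambda}{m}\,\frac{d}{du} \log \pi(u) + E\!\left[\frac{1}{u - x}\right], \tag{8.13.6} $$

or, after an integration with respect to $u$,

$$ \int_{-1}^{1} w(x) \log (u - x)\,dx = \text{const} + \frac{\lambda}{m} \log \pi(u) - Q(u), \tag{8.13.7} $$

where

$$ Q(u) = \int_{u}^{\infty} E\!\left[\frac{1}{u - x}\right] du. $$

Equation (8.13.7) can be resolved in the form

$$ \pi(u) = C_m \exp \left\{ \frac{m}{\lambda} \left[ \int_{-1}^{1} w(x) \log (u - x)\,dx + Q(u) \right] \right\}, $$

or, equivalently, in the form

$$ \pi(u) = C_m u^m \exp \left[ \frac{m}{\lambda} \int_{-1}^{1} w(x) \log \left( 1 - \frac{x}{u} \right) dx \right] \exp \left[ \frac{m}{\lambda}\,Q(u) \right]. \tag{8.13.9} $$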


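Now the error term in (8.13.6) is expressible in the form

$$ E\!\left[\frac{1}{u - x}\right] = \int_{-1}^{1} G(s)\,\frac{d^{m+1}}{ds^{m+1}} \left( \frac{1}{u - s} \right) ds = (m + 1)! \int_{-1}^{1} \frac{G(s)\,ds}{(u - s)^{m+2}}, \tag{8.13.10} $$

where $G(s)$ is the influence function defined by the relation

$$ m!\,G(s) = \int_{s}^{1} w(x)(x - s)^m\,dx - \frac{\lambda}{m} \sum_{x_k \ge s} (x_k - s)^m, \tag{8.13.11} $$

in accordance with (5.11.15) and (5.11.16). Accordingly, there follows

$$ Q(u) = \int_{u}^{\infty} E\,du = m! \int_{-1}^{1} \frac{G(s)\,ds}{(u - s)^{m+1}}. \tag{8.13.12} $$

For present purposes, it is not necessary to evaluate this expression
explicitly. However, it is important to notice that it can be expanded in
the form

$$ Q(u) = \frac{m!}{u^{m+1}} \int_{-1}^{1} G(s) \left[ 1 + (m + 1)\frac{s}{u} + \frac{(m + 1)(m + 2)}{2!}\,\frac{s^2}{u^2} + \cdots \right] ds = \frac{q_0}{u^{m+1}} + \frac{q_1}{u^{m+2}} + \cdots, \tag{8.13.13} $$

since $u > 1$, where the coefficient $q_r$ is a certain multiple of $\int_{-1}^{1} s^r G(s)\,ds$.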


Similarly, we see that

$$ \int_{-1}^{1} w(x) \log \left( 1 - \frac{x}{u} \right) dx = -\int_{-1}^{1} w(x) \left( \frac{x}{u} + \frac{x^2}{2u^2} + \frac{x^3}{3u^3} + \cdots \right) dx = -\lambda \left( \frac{c_1}{u} + \frac{c_2}{2u^2} + \frac{c_3}{3u^3} + \cdots \right) \tag{8.13.14} $$

when $u > 1$, where

$$ c_k = \frac{1}{\lambda} \int_{-1}^{1} x^k w(x)\,dx. \tag{8.13.15} $$

Hence (8.13.9) can be expanded in the form

$$ \pi(u) = C_m u^m \exp \left[ -m \left( \frac{c_1}{u} + \frac{c_2}{2u^2} + \cdots \right) \right] \exp \left[ \frac{m}{\lambda} \left( \frac{q_0}{u^{m+1}} + \frac{q_1}{u^{m+2}} + \cdots \right) \right] $$
$$ \qquad = C_m u^m \left( 1 - \frac{m c_1}{u} + \cdots \right) \left( 1 + \frac{m q_0}{\lambda u^{m+1}} + \cdots \right), \tag{8.13.16} $$

where the two relevant power series in $u^{-1}$ converge when $u > 1$. But,
since $\pi(u)$ is a polynomial of degree $m$, the product of the two series will
terminate before the term containing $u^{-m-1}$. Thus the terms in the second
series, after the leading term, do not enter into the determination
of the terms which will remain in the product, but serve only to
bring about the cancellation of all terms involving $u^{-m-1}$, $u^{-m-2}$, and so
forth. Hence the second series can be disregarded, and the desired polynomial
can be obtained by merely terminating the first series with the term
involving $u^{-m}$. Also, since the coefficient of $u^m$ in $\pi(u)$ is to be unity, we
must take $C_m = 1$.

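It thus follows that if $\pi(x)$ exists such that (8.13.1) has a degree of
precision of at least $m$, then $\pi(x)$ is defined by the expansion

$$ \exp \left[ \frac{m}{\lambda} \int_{-1}^{1} w(t) \log (x - t)\,dt \right] = x^m \exp \left[ \frac{m}{\lambda} \int_{-1}^{1} w(t) \log \left( 1 - \frac{t}{x} \right) dt \right] $$
$$ \qquad = x^m - m c_1 x^{m-1} + \frac{m}{2}\,(m c_1^2 - c_2)\,x^{m-2} + \cdots, \tag{8.13.17} $$

where the last series is to be terminated with the last term having a
nonnegative exponent.

In the special case when

$$ w(x) = 1, $$

and hence also

$$ \lambda = 2, \qquad c_k = \frac{1}{2} \int_{-1}^{1} x^k\,dx = \begin{cases} \dfrac{1}{k + 1} & (k \text{ even}), \\ 0 & (k \text{ odd}), \end{cases} $$

the first four terms of the expansion of

$$ x^m \exp \left[ \frac{m}{2} \int_{-1}^{1} \log \left( 1 - \frac{t}{x} \right) dt \right] = x^m \exp \left[ -m \left( \frac{1}{2 \cdot 3\,x^2} + \frac{1}{4 \cdot 5\,x^4} + \frac{1}{6 \cdot 7\,x^6} + \cdots \right) \right] $$

are found to be

$$ \pi(x) = x^m - \frac{m}{6}\,x^{m-2} + \frac{m}{360}\,(5m - 18)\,x^{m-4} - \frac{m}{45360}\,(35m^2 - 378m + 1080)\,x^{m-6} + \cdots, \tag{8.13.20} $$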
where the series is to be terminated with the first term if $m = 0$ or 1,
with the second if $m = 2$ or 3, and so forth. If the $m$th such polynomial
is denoted here by $G_m(x)$, the first six such polynomials are thus obtained
as follows:

$$ G_0(x) = 1, \qquad G_1(x) = x, \qquad G_2(x) = \tfrac{1}{3}(3x^2 - 1), $$
$$ G_3(x) = \tfrac{1}{2}(2x^3 - x), \qquad G_4(x) = \tfrac{1}{45}(45x^4 - 30x^2 + 1), $$
$$ G_5(x) = \tfrac{1}{72}(72x^5 - 60x^3 + 7x). \tag{8.13.21} $$

It is seen that the polynomials of even and odd degrees are even and odd
functions of $x$, respectively, so that their zeros are symmetrically placed
about $x = 0$.
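The truncated expansion that defines these polynomials when $w(x) = 1$ is easy to reproduce in exact rational arithmetic. The sketch below is an illustration only, under the assumption that the exponent series is $-m \sum_j z^j / [2j(2j+1)]$ with $z = 1/x^2$, as in the expansion above; the function name `chebyshev_quadrature_poly` is hypothetical. It uses the standard recurrence $n e_n = \sum_{k=1}^{n} k\,s_k\,e_{n-k}$ for the coefficients of the exponential of a power series.

```python
from fractions import Fraction

def chebyshev_quadrature_poly(m):
    """Return {power: coefficient} for G_m(x) when w(x) = 1.

    G_m(x) is the polynomial part of x**m * exp(-m * sum_j z**j / (2j(2j+1))),
    with z = 1/x**2, so only the terms through z**(m // 2) are kept.
    """
    n_terms = m // 2
    # s[j] is the coefficient of z**j in the exponent (s[0] = 0).
    s = [Fraction(0)] + [Fraction(-m, 2 * j * (2 * j + 1))
                         for j in range(1, n_terms + 1)]
    # e[n] is the coefficient of z**n in exp(s(z)).
    e = [Fraction(1)]
    for n in range(1, n_terms + 1):
        e.append(sum(k * s[k] * e[n - k] for k in range(1, n + 1)) / n)
    return {m - 2 * j: e[j] for j in range(n_terms + 1)}

for m in range(6):
    print(m, chebyshev_quadrature_poly(m))
```

For $m = 4$ this yields the coefficients $1$, $-\tfrac{2}{3}$, $\tfrac{1}{45}$ of $x^4$, $x^2$, and $1$, in agreement with $G_4(x)$ above.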

It has been found that the zeros of the polynomials $G_1(x)$, $G_2(x)$,
. . . , $G_7(x)$ and $G_9(x)$ are all real, that they lie inside the interval $(-1,1)$,
and that the quadrature formula (8.13.1), with abscissas identified with
a set of such zeros, accordingly does indeed have a degree of precision
equal to or greater than the number of abscissas, when $w(x) = 1$. However,
six of the zeros of $G_8(x)$ are complex, and each $G_m(x)$ for $m \ge 10$
possesses at least one pair of complex zeros (see Bernstein [36]). Thus
when $w(x) = 1$, the quadrature formula is useful only when $m \le 7$ and
$m = 9$.

The abscissas corresponding to the formula

$$ \int_{-1}^{1} f(x)\,dx = \frac{2}{m} \sum_{k=1}^{m} f(x_k) + E, \tag{8.13.22} $$

for all relevant values of $m$, are listed to six digits in Table 8.6.

TABLE 8.6

 m    Abscissas

 2    ±0.577350
 3    0, ±0.707107
 4    ±0.187592, ±0.794654
 5    0, ±0.374541, ±0.832497
 6    ±0.266635, ±0.422519, ±0.866247
 7    0, ±0.323912, ±0.529657, ±0.883862
 9    0, ±0.167906, ±0.528762, ±0.601019, ±0.911589
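The tabulated abscissas can be checked against the requirement that the equal-weight formula (8.13.22) integrate $1, x, \ldots, x^m$ exactly. The following sketch is illustrative only; the names `CHEBYSHEV_POSITIVE`, `abscissas`, `chebyshev_rule`, and `monomial_error` are hypothetical, and agreement can be expected only to about the six-digit accuracy of the table.

```python
# Positive abscissas from Table 8.6; x = 0 is added when m is odd.
CHEBYSHEV_POSITIVE = {
    2: [0.577350],
    3: [0.707107],
    4: [0.187592, 0.794654],
    5: [0.374541, 0.832497],
    6: [0.266635, 0.422519, 0.866247],
    7: [0.323912, 0.529657, 0.883862],
    9: [0.167906, 0.528762, 0.601019, 0.911589],
}

def abscissas(m):
    """All m abscissas, symmetric about zero."""
    pos = CHEBYSHEV_POSITIVE[m]
    return [-x for x in reversed(pos)] + ([0.0] if m % 2 else []) + pos

def chebyshev_rule(f, m):
    """Equal-weight approximation (2/m) * sum f(x_k) to the integral over (-1, 1)."""
    return (2.0 / m) * sum(f(x) for x in abscissas(m))

def monomial_error(m, r):
    """Exact integral of x**r over (-1, 1) minus the m-point approximation."""
    exact = 0.0 if r % 2 else 2.0 / (r + 1)
    return exact - chebyshev_rule(lambda x: x**r, m)

for m in CHEBYSHEV_POSITIVE:
    worst = max(abs(monomial_error(m, r)) for r in range(m + 1))
    print(m, worst)   # small, limited by the rounding of the table
```

The first monomial that is not integrated exactly then gives a direct view of the error: for $m = 2$ and $m = 3$ the error for $f(x) = x^4$ is close to $\tfrac{8}{45}$ and $\tfrac{1}{15}$, respectively.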


Whereas the appropriate error term in each case can be expressed in
the form

$$ E = \int_{-1}^{1} G(s) f^{(m+1)}(s)\,ds, \tag{8.13.23} $$

where $G(s)$ is defined by (8.13.11), with $w(x) = 1$, recourse to the third
method of §5.11 leads more simply to the desired results. For this purpose,
we may notice first that, since the coefficient of $x^m$ in $G_m(x)$ is unity,
there follows $w(x)\pi(x) = G_m(x)$. Further, by integrating the expressions
given in (8.13.21), and determining the constant of integration in each
case such that the integral vanishes at one (and hence both) of the limits
$\pm 1$, there follows

$$ G_1(x) = \left[ \tfrac{1}{2}(x^2 - 1) \right]', \qquad G_2(x) = \left[ \tfrac{1}{3}(x^3 - x) \right]' = \left[ \tfrac{1}{12}(x^2 - 1)^2 \right]'', $$
$$ G_3(x) = \left[ \tfrac{1}{4}x^2(x^2 - 1) \right]', \qquad G_4(x) = \left[ \tfrac{1}{45}(9x^5 - 10x^3 + x) \right]' = \left[ \tfrac{1}{90}(x^2 - 1)^2(1 + 3x^2) \right]'', $$
$$ G_5(x) = \left[ \tfrac{1}{144}(x^2 - 1)(24x^4 - 6x^2 + 1) \right]', \tag{8.13.24} $$

and so forth. Thus, when $m$ is odd, there follows $G_m(x) = V_m'(x)$, where
$V_m$ vanishes at the ends of the interval $(-1,1)$ and is of constant sign
inside that interval, whereas, when $m$ is even, there follows $G_m(x) = V_m''(x)$,
where $V_m$ and $V_m'$ vanish at the ends of the interval and $V_m$ is of constant
sign in the interior.

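It follows from (5.11.38), with $n + 1 = m$, that the error $E_m$ associated
with an $m$-point formula is given by

$$ E_m = \begin{cases} c_m\,\dfrac{f^{(m+1)}(\xi)}{(m + 1)!} & (m \text{ odd}), \\[2ex] c_m\,\dfrac{f^{(m+2)}(\xi)}{(m + 2)!} & (m \text{ even}), \end{cases} \tag{8.13.25} $$

where

$$ c_m = \begin{cases} -\displaystyle\int_{-1}^{1} V_m(x)\,dx = \int_{-1}^{1} x\,G_m(x)\,dx & (m \text{ odd}), \\[2ex] 2\displaystyle\int_{-1}^{1} V_m(x)\,dx = \int_{-1}^{1} x^2 G_m(x)\,dx & (m \text{ even}). \end{cases} \tag{8.13.26} $$

The first six of these values are found to be $c_1 = \tfrac{2}{3}$, $c_2 = \tfrac{8}{45}$, $c_3 = \tfrac{1}{15}$,
$c_4 = \tfrac{32}{945}$, $c_5 = \tfrac{13}{756}$, $c_6 = \tfrac{16}{1575}$.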
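The constants $c_m$ follow from the second integral forms in (8.13.26) by exact integration of monomials. The sketch below is an illustration under the assumption $w(x) = 1$; the helper names `g_coeffs` and `c_value` are hypothetical, and `g_coeffs` simply regenerates $G_m(x)$ from the truncated exponential series of this section.

```python
from fractions import Fraction

def g_coeffs(m):
    """{power: coefficient} of G_m(x) for w(x) = 1 (truncated exponential series)."""
    n_terms = m // 2
    s = [Fraction(0)] + [Fraction(-m, 2 * j * (2 * j + 1))
                         for j in range(1, n_terms + 1)]
    e = [Fraction(1)]
    for n in range(1, n_terms + 1):
        e.append(sum(k * s[k] * e[n - k] for k in range(1, n + 1)) / n)
    return {m - 2 * j: e[j] for j in range(n_terms + 1)}

def c_value(m):
    """c_m = integral of x*G_m over (-1, 1) for odd m, of x**2*G_m for even m."""
    extra = 1 if m % 2 else 2
    total = Fraction(0)
    for power, coeff in g_coeffs(m).items():
        n = power + extra            # always an even exponent
        total += coeff * Fraction(2, n + 1)
    return total

print([c_value(m) for m in range(1, 7)])
```

The six values printed are $\tfrac{2}{3}$, $\tfrac{8}{45}$, $\tfrac{1}{15}$, $\tfrac{32}{945}$, $\tfrac{13}{756}$, and $\tfrac{16}{1575}$.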

In the case $m = 2$, formula (8.13.22) reduces to the Legendre-Gauss
two-point formula. It may be noticed that the degree of precision is $m$
when $m$ is odd, but is $m + 1$ when $m$ is even. More generally, whenever
$w(x)$ is an even function of $x$, it is apparent from the symmetry that both
members of (8.13.1) will vanish when $f(x)$ is any polynomial of odd degree
(or any odd function of $x$). Hence, in such cases, if $m$ is even and if the
degree of precision is at least $m$, then it is also at least $m + 1$.†

The Chebyshev-Gauss formula of §8.8, with $w(x) = (1 - x^2)^{-1/2}$, is a
particularly notable member of the general class of formulas considered
in this section, since in that case it was seen that the degree of precision
attains its maximum value $2m - 1$, in spite of the fact that the weights
are equal. It can be rederived here by noticing that (8.13.2) gives $\lambda = \pi$,
and hence $W = \pi/m$, in accordance with (8.8.11). Equation (8.13.17)
then gives

$$ \exp \left[ \frac{m}{\lambda} \int_{-1}^{1} w(t) \log (x - t)\,dt \right] = \exp \left[ \frac{m}{\pi} \int_{-1}^{1} \frac{\log (x - t)}{\sqrt{1 - t^2}}\,dt \right] $$
$$ \qquad = \exp \left[ m \log \frac{x + \sqrt{x^2 - 1}}{2} \right] = \left( \frac{x + \sqrt{x^2 - 1}}{2} \right)^m = \frac{x^m}{2^m} \left( 1 + \sqrt{1 - \frac{1}{x^2}} \right)^m $$

when $x > 1$, and the polynomial part of the last indicated expansion can
be shown to be identical with the expanded form of

$$ 2^{1-m}\,T_m(x) = 2^{1-m} \cos (m \cos^{-1} x). $$

† The difference between this situation and that relevant to Newton-Cotes
quadrature is a consequence of the fact that there the minimum degree of
precision is $m - 1$, where $m$ ordinates are used. Thus an increase of (at least)
one degree occurs if $m - 1$ is even, and hence $m$ is odd.


8.14. Algebraic Derivations. Any specific one of the quadrature formulas
considered in this chapter can be obtained directly by purely
algebraic methods, without the use of properties of orthogonal functions.
In cases in which the weighting function is given empirically, or in which
only a single specific formula is desired, such methods are often to be
preferred. For this reason, they are discussed briefly in this section.

We suppose here that the formula is to be of the form

$$ \int_{a}^{b} w(x) f(x)\,dx = \sum_{k=1}^{m} W_k f(x_k) + E, \tag{8.14.1} $$

where $w(x) \ge 0$ in $(a,b)$, and that the abscissas and weights are to be
chosen in such a way that the degree of precision is at least $m - 1$, so
that $E = 0$ at least when $f(x) = x^r$ $(r = 0, 1, \ldots, m - 1)$. If we
define the $r$th moment $M_r$, associated with $w(x)$ over $(a,b)$, by the equation

$$ \int_{a}^{b} x^r w(x)\,dx = M_r \qquad (r = 0, 1, 2, \ldots), \tag{8.14.2} $$

the requirement that the degree of precision of (8.14.1) be at least $N$ is
represented by the $N + 1$ conditions

$$ \sum_{k=1}^{m} W_k x_k^r = M_r \qquad (r = 0, 1, \ldots, N). \tag{8.14.3} $$

Whereas these equations are linear in the $m$ weights $W_k$, they are nonlinear
in the $m$ abscissas $x_k$, and the purpose of this section is to indicate
in what way the difficulties associated with this nonlinearity can be
overcome. The procedures to be used, in those situations in which no conditions
are imposed on the weights, may be easily generalized from the simple case
in which $m = 2$. Hence, in order to simplify the notation, we consider
that case specifically, but describe the procedures in general terms.

The simplest case, clearly, is that in which the $m$ abscissas are
preassigned. Then, unless they are chosen in a special way, we can require
only that the degree of precision $N$ be at least $m - 1$. When $m = 2$,
the two conditions to be satisfied are then

$$ W_1 + W_2 = M_0, \qquad W_1 x_1 + W_2 x_2 = M_1. \tag{8.14.4} $$

Since the abscissas are assigned, we have $m$ simultaneous linear equations
in the $m$ unknown weights, and it can be shown that these equations
always possess a unique solution.

On the other extreme, we have the Gaussian case, in which no constraints
are imposed and in which the degree of precision is to be $2m - 1$.
In the case $m = 2$, the four conditions to be satisfied are then of the form

$$ \begin{aligned} W_1 + W_2 &= M_0, \\ W_1 x_1 + W_2 x_2 &= M_1, \\ W_1 x_1^2 + W_2 x_2^2 &= M_2, \\ W_1 x_1^3 + W_2 x_2^3 &= M_3, \end{aligned} \tag{8.14.5} $$

representing four equations in the four unknown quantities $x_1$, $x_2$, $W_1$,
and $W_2$. In order to solve these equations, we let $x_1$ and $x_2$ be the zeros
of $\pi(x)$,

$$ \pi(x) = (x - x_1)(x - x_2) = x^2 + a_1 x + a_2, $$

and attempt first to determine the coefficients $a_1$ and $a_2$. If we multiply
the third equation of (8.14.5) by 1, the second by $a_1$, and the first by $a_2$,
and add the results, making use of the fact that

$$ x_1^2 + a_1 x_1 + a_2 = 0, \qquad x_2^2 + a_1 x_2 + a_2 = 0, $$

we obtain the condition

$$ M_2 + M_1 a_1 + M_0 a_2 = 0. \tag{8.14.7} $$

Similarly, from the fourth, third, and second equations we obtain the
requirement

$$ M_3 + M_2 a_1 + M_1 a_2 = 0. \tag{8.14.8} $$
The last two equations are linear in $a_1$ and $a_2$. If $M_1^2 \ne M_0 M_2$, they
possess a unique solution. The abscissas $x_1$ and $x_2$ are then determined
as the roots of the algebraic equation $\pi(x) = 0$, and the weights $W_1$ and
$W_2$ are finally determined from any two (say the first two) equations of
(8.14.5).

The generalization is obvious, since, in the general case, $\pi(x)$ will be
specified by $m$ $a$'s and the $2m$ equations replacing (8.14.5) will provide $m$
sets of $m + 1$ successive equations, from each of which a linear equation
in the $a$'s may be obtained by the same general procedure as that which
led to (8.14.7). These equations will (generally) determine the $a$'s, after
which the abscissas are obtained as the roots of $\pi(x) = 0$ and, finally, the
first $m$ of the basic equations determine the weights.
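The two-point Gaussian procedure just described can be sketched in a few lines. This is an illustration only, not part of the original text: the function name `two_point_gauss` is hypothetical, the linear pair (8.14.7) and (8.14.8) is solved by Cramer's rule, and real distinct roots of $\pi(x) = 0$ are assumed.

```python
from math import sqrt

def two_point_gauss(M0, M1, M2, M3):
    """Abscissas and weights of the two-point Gaussian formula from four moments.

    Solves  M1*a1 + M0*a2 = -M2  and  M2*a1 + M1*a2 = -M3  for a1, a2,
    takes x1, x2 as the roots of x**2 + a1*x + a2 = 0, and then finds the
    weights from the first two moment conditions.
    """
    det = M1 * M1 - M0 * M2
    if det == 0:
        raise ValueError("M1**2 == M0*M2: the a's are not uniquely determined")
    a1 = (M0 * M3 - M1 * M2) / det
    a2 = (M2 * M2 - M1 * M3) / det
    disc = a1 * a1 - 4.0 * a2
    if disc <= 0:
        raise ValueError("roots of pi(x) are not real and distinct")
    x1 = (-a1 - sqrt(disc)) / 2.0
    x2 = (-a1 + sqrt(disc)) / 2.0
    W1 = (M0 * x2 - M1) / (x2 - x1)
    W2 = M0 - W1
    return (x1, x2), (W1, W2)

# w(x) = 1 on (-1, 1): moments 2, 0, 2/3, 0.
print(two_point_gauss(2.0, 0.0, 2.0 / 3.0, 0.0))
# w(x) = exp(-x) on (0, infinity): moments 0!, 1!, 2!, 3!.
print(two_point_gauss(1.0, 1.0, 2.0, 6.0))
```

With $w(x) = 1$ on $(-1,1)$ the abscissas are $\pm 1/\sqrt{3}$ with unit weights; with $w(x) = e^{-x}$ on $(0,\infty)$ they are $2 \mp \sqrt{2}$.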

In the intermediate cases, in which, say, $m - r$ of the $m$ abscissas are
preassigned, we can hope only for a degree of precision $m + r - 1$ (unless
those abscissas are assigned in a special way), and hence there will be
$m + r$ basic equations replacing (8.14.5). If we again let $\pi(x)$ denote
the product $(x - x_1)(x - x_2) \cdots (x - x_m)$, involving the fixed abscissas
as well as the free ones, then $\pi(x)$ will again be specified by $m$ $a$'s. From
the $m + r$ basic equations, we can proceed as in the derivation of (8.14.7)
$r$ times, and hence can obtain $r$ linear equations in the $a$'s. The $m - r$
additional linear equations needed for the determination of the $m$ $a$'s then
follow from the requirements that the $m - r$ fixed abscissas satisfy the
equation $\pi(x_k) = 0$.

Thus, in the case $m = 2$, $r = 1$, the three basic equations are

$$ \begin{aligned} W_1 + W_2 &= M_0, \\ W_1 x_1 + W_2 x_2 &= M_1, \\ W_1 x_1^2 + W_2 x_2^2 &= M_2. \end{aligned} \tag{8.14.9} $$

Under the assumption that $x_1$ and $x_2$ satisfy the equation

$$ x^2 + a_1 x + a_2 = 0, \tag{8.14.10} $$

we again obtain (8.14.7). By combining this condition with the requirement
that the preassigned value $x_1$ satisfy (8.14.10), we deduce that $a_1$
and $a_2$ are determined uniquely by the two linear equations

$$ M_2 + M_1 a_1 + M_0 a_2 = 0, \qquad x_1^2 + x_1 a_1 + a_2 = 0, \tag{8.14.11} $$

under the assumption that $M_0 x_1 \ne M_1$.

There is no guarantee, in this case, that the zeros of $\pi(x)$ will be real
and distinct or, if so, that they will lie in $(a,b)$. However, if a quadrature
formula of the type sought exists, it can be obtained by the method
outlined.
As a simple illustrative example, we suppose that a quadrature formula
is required to be of the form

$$ \int_{0}^{1} x^{1/2} f(x)\,dx = W_1 f(x_1) + W_2 f(1) + E, \tag{8.14.12} $$

where $x_2 = 1$ is preassigned. The expected degree of precision is then
two, corresponding to the fact that three free parameters $x_1$, $W_1$, and $W_2$
are available. We first calculate the relevant moments,

$$ M_r = \int_{0}^{1} x^{(2r+1)/2}\,dx = \frac{2}{2r + 3} \qquad (r = 0, 1, 2), $$

after which the three basic conditions (8.14.9) become

$$ W_1 + W_2 = \tfrac{2}{3}, \qquad W_1 x_1 + W_2 = \tfrac{2}{5}, \qquad W_1 x_1^2 + W_2 = \tfrac{2}{7}. \tag{8.14.13} $$

By writing $\pi(x) = (x - x_1)(x - 1) = x^2 + a_1 x + a_2$, we deduce from
(8.14.11) that $a_1$ and $a_2$ must satisfy the equations

$$ \tfrac{2}{7} + \tfrac{2}{5} a_1 + \tfrac{2}{3} a_2 = 0, \qquad 1 + a_1 + a_2 = 0, $$

and obtain $a_1 = -\tfrac{10}{7}$, $a_2 = \tfrac{3}{7}$, and hence $\pi(x) = (7x^2 - 10x + 3)/7$.
Thus there follows

$$ x_1 = \tfrac{3}{7}, \qquad x_2 = 1. \tag{8.14.14} $$

With these results, the first two equations of (8.14.13) give

$$ W_1 = \tfrac{7}{15}, \qquad W_2 = \tfrac{1}{5}. \tag{8.14.15} $$

Thus (8.14.12) becomes

$$ \int_{0}^{1} x^{1/2} f(x)\,dx = \tfrac{7}{15} f(\tfrac{3}{7}) + \tfrac{1}{5} f(1) + E. \tag{8.14.16} $$

We verify that $E = 0$ for $f(x) = 1$, $x$, and $x^2$, and find that $E \ne 0$ when
$f(x) = x^3$. Hence $N = 2$.
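The arithmetic of this example can be repeated in exact rational form. The sketch below is illustrative only; the function name `one_fixed_one_free` is hypothetical, and it treats the case $m = 2$, $r = 1$ by solving the pair (8.14.11) for a general preassigned abscissa, under the assumption that the free abscissa turns out to differ from the fixed one.

```python
from fractions import Fraction

def one_fixed_one_free(M0, M1, M2, x_fixed):
    """Two-point formula with one preassigned abscissa (case m = 2, r = 1).

    Solves  M2 + M1*a1 + M0*a2 = 0  and  x_fixed**2 + x_fixed*a1 + a2 = 0,
    then uses x_free + x_fixed = -a1 and the first two moment conditions.
    """
    denom = M1 - M0 * x_fixed
    if denom == 0:
        raise ValueError("M0*x_fixed == M1: a1 and a2 are not determined")
    a1 = (M0 * x_fixed**2 - M2) / denom
    a2 = -x_fixed**2 - x_fixed * a1
    x_free = -a1 - x_fixed
    W_free = (M0 * x_fixed - M1) / (x_fixed - x_free)
    W_fixed = M0 - W_free
    return a1, a2, x_free, W_free, W_fixed

# Moments of w(x) = x**(1/2) over (0, 1): M_r = 2 / (2r + 3).
M = [Fraction(2, 2 * r + 3) for r in range(4)]
a1, a2, x1, W1, W2 = one_fixed_one_free(M[0], M[1], M[2], Fraction(1))
print(a1, a2, x1, W1, W2)

# Error of the formula for f(x) = x**3, divided by 3!.
E3 = M[3] - W1 * x1**3 - W2 * Fraction(1)
print(E3 / 6)
```

The last printed value is the coefficient of $f'''(\xi)$ that would appear in an error term of the form $E = K f'''(\xi)$.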

In order to obtain an expression for the error term $E$, we may make use
of one of the methods of §5.11. In particular, the influence function
(5.11.16) is readily determined in the form

$$ G(s) = \begin{cases} -\tfrac{8}{105}\,s^{7/2} & (0 \le s \le \tfrac{3}{7}), \\[1ex] -\tfrac{1}{210}\,(16 s^{7/2} - 49 s^2 + 42 s - 9) & (\tfrac{3}{7} \le s \le 1), \end{cases} \tag{8.14.17} $$

and is found to be negative throughout the interior of the interval $(0,1)$.
Thus the formula (5.11.31) can be used, to give

$$ E = \frac{f'''(\xi)}{3!} \left[ \int_{0}^{1} x^{7/2}\,dx - \tfrac{7}{15} \left( \tfrac{3}{7} \right)^3 - \tfrac{1}{5} \right] = -\tfrac{16}{6615}\,f'''(\xi), \tag{8.14.18} $$

where $0 < \xi < 1$.

The same result can be obtained somewhat more easily by use of the
third method described in §5.11. For we find that

$$ w(x)\pi(x) = \tfrac{2}{7} \left[ x^{3/2}(x - 1)^2 \right]', $$

where the constant of integration is determined so that the content of the
brackets vanishes when $x = 0$. Since it vanishes also when $x = 1$, and
is positive for all intermediate values of $x$, we may make use of (5.11.38)
with $n + 1 = m = 2$, $r = 1$, and $V = \tfrac{2}{7} x^{3/2}(x - 1)^2$, to deduce that

$$ E = -\frac{f'''(\xi)}{3!} \cdot \frac{2}{7} \int_{0}^{1} x^{3/2}(x - 1)^2\,dx = -\tfrac{16}{6615}\,f'''(\xi), $$

as before.


Finally, in the general case of Chebyshev quadrature, in which all the
weights are to be equal, the formula is of the form

$$ \int_{a}^{b} w(x) f(x)\,dx = W \sum_{k=1}^{m} f(x_k) + E, \tag{8.14.19} $$

and the $m + 1$ conditions requiring that the degree of precision be at
least $m$ are of the form

$$ \begin{aligned} 1 + 1 + 1 + \cdots + 1 &= \bar{M}_0, \\ x_1 + x_2 + x_3 + \cdots + x_m &= \bar{M}_1, \\ x_1^2 + x_2^2 + x_3^2 + \cdots + x_m^2 &= \bar{M}_2, \\ &\;\;\vdots \\ x_1^m + x_2^m + x_3^m + \cdots + x_m^m &= \bar{M}_m, \end{aligned} \tag{8.14.20} $$

where we have written

$$ \bar{M}_r = \frac{1}{W} M_r = \frac{1}{W} \int_{a}^{b} x^r w(x)\,dx. \tag{8.14.21} $$

From the first equation, there follows immediately $\bar{M}_0 = m$, and hence

$$ W = \frac{1}{m} \int_{a}^{b} w(x)\,dx = \frac{1}{m} M_0. \tag{8.14.22} $$

Under the assumption that the problem possesses a (real) solution,† we
again write

$$ \pi(x) = (x - x_1)(x - x_2) \cdots (x - x_m) = x^m + a_1 x^{m-1} + a_2 x^{m-2} + \cdots + a_{m-1} x + a_m, \tag{8.14.23} $$

and attempt to determine the $m$ coefficients $a_1, \ldots, a_m$.

† As was pointed out in §8.13, this assumption is not always valid.
First, by multiplying the first equation in (8.14.20) by $a_m$, the second
by $a_{m-1}$, . . . , the next-to-last by $a_1$, and the last by 1, adding the
results, and using the fact that each $x_k$ satisfies $\pi(x) = 0$, we obtain one
linear equation relating the $a$'s in the form

$$ \bar{M}_m + \bar{M}_{m-1} a_1 + \bar{M}_{m-2} a_2 + \cdots + \bar{M}_1 a_{m-1} + m\,a_m = 0. \tag{8.14.24} $$

In order to obtain $m - 1$ complementary relations, we require certain
results from the theory of so-called symmetric functions. We recall that
the coefficients in (8.14.23) are related to the zeros of $\pi(x)$ by the equations

$$ \begin{aligned} x_1 + x_2 + x_3 + \cdots + x_m &= -a_1, \\ x_1 x_2 + x_1 x_3 + \cdots + x_{m-1} x_m &= a_2, \\ x_1 x_2 x_3 + x_1 x_2 x_4 + \cdots + x_{m-2} x_{m-1} x_m &= -a_3, \\ &\;\;\vdots \\ x_1 x_2 \cdots x_m &= (-1)^m a_m, \end{aligned} \tag{8.14.25} $$

where the left-hand member of the $r$th equation is the sum of all products
of $r$ of the roots and is known as the $r$th elementary symmetric function of
$x_1, x_2, \ldots, x_m$. The left-hand members of (8.14.20) are also symmetric
functions of the zeros, since they are unchanged when any two of the
symbols $x_1, \ldots, x_m$ are interchanged, and it is known that the $r$th
member of either set can be expressed as a polynomial function of the
first $r$ members of the other set.

In particular, we have immediately

$$ a_1 = -\bar{M}_1, $$

and it is easily seen that

$$ a_2 = \tfrac{1}{2}(\bar{M}_1^2 - \bar{M}_2) = -\tfrac{1}{2}(\bar{M}_1 a_1 + \bar{M}_2). $$

These relations, as well as that of (8.14.24), are special cases of the general
recurrence formula

$$ r a_r + \bar{M}_1 a_{r-1} + \bar{M}_2 a_{r-2} + \cdots + \bar{M}_{r-1} a_1 + \bar{M}_r = 0 \qquad (r = 1, 2, \ldots, m), \tag{8.14.28} $$

which permits the expression of each of the $a$'s in terms of the reduced
moments $\bar{M}_1, \bar{M}_2, \ldots,$ and $\bar{M}_m$. The required abscissas are then the
roots of the equation

$$ x^m + a_1 x^{m-1} + a_2 x^{m-2} + \cdots + a_{m-1} x + a_m = 0, \tag{8.14.29} $$

if those roots are real and distinct. Otherwise, the desired formula does
not exist.
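In illustration, in order to determine the Chebyshev abscissas for the
quadrature formula

$$ \int_{-1}^{1} (1 - x^2) f(x)\,dx = W [f(x_1) + f(x_2) + f(x_3)] + E, \tag{8.14.30} $$

we first calculate the common weight,

$$ W = \tfrac{1}{3} \int_{-1}^{1} (1 - x^2)\,dx = \tfrac{4}{9}, $$

and then the relevant reduced moments,

$$ \bar{M}_1 = \tfrac{9}{4} \int_{-1}^{1} x(1 - x^2)\,dx = 0, \qquad \bar{M}_2 = \tfrac{9}{4} \int_{-1}^{1} x^2(1 - x^2)\,dx = \tfrac{3}{5}, \qquad \bar{M}_3 = \tfrac{9}{4} \int_{-1}^{1} x^3(1 - x^2)\,dx = 0. $$

Next, from (8.14.28) with $r = 1$, 2, and 3, there follows

$$ a_1 = -\bar{M}_1 = 0, \qquad a_2 = \tfrac{1}{2}(-\bar{M}_1 a_1 - \bar{M}_2) = -\tfrac{3}{10}, \qquad a_3 = \tfrac{1}{3}(-\bar{M}_1 a_2 - \bar{M}_2 a_1 - \bar{M}_3) = 0. $$

Hence, the required abscissas are obtained as the roots of the equation
$x^3 - \tfrac{3}{10} x = 0$, in the form

$$ x_1 = -\sqrt{\tfrac{3}{10}}, \qquad x_2 = 0, \qquad x_3 = \sqrt{\tfrac{3}{10}}, \tag{8.14.31} $$

so that the desired formula is

$$ \int_{-1}^{1} (1 - x^2) f(x)\,dx = \tfrac{4}{9} \left[ f\!\left(-\sqrt{\tfrac{3}{10}}\right) + f(0) + f\!\left(\sqrt{\tfrac{3}{10}}\right) \right] + E. \tag{8.14.32} $$

It is easily verified that $E = 0$ when $f(x) = 1$, $x$, $x^2$, and $x^3$, but that
$E \ne 0$ when $f(x) = x^4$, so that the degree of precision is $N = 3$.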
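The recurrence (8.14.28) is readily mechanized. The following sketch is illustrative only: the function name `chebyshev_coefficients` is hypothetical, the input list is assumed to hold the reduced moments $\bar{M}_1, \ldots, \bar{M}_m$ as exact fractions, and the convention $a_0 = 1$ is introduced purely for convenience in writing the sum.

```python
from fractions import Fraction

def chebyshev_coefficients(reduced_moments):
    """Coefficients a_1, ..., a_m of pi(x) from reduced moments Mbar_1, ..., Mbar_m.

    Uses  r*a_r + Mbar_1*a_{r-1} + ... + Mbar_{r-1}*a_1 + Mbar_r = 0,
    written as  r*a_r = -sum_{k=1..r} Mbar_k * a_{r-k}  with a_0 = 1.
    """
    m = len(reduced_moments)
    M = [None] + list(reduced_moments)     # shift to 1-based indexing
    a = [Fraction(1)]                      # a[0] = 1 by convention
    for r in range(1, m + 1):
        s = sum(M[k] * a[r - k] for k in range(1, r + 1))
        a.append(-s / r)
    return a[1:]

# Example of the text: w(x) = 1 - x**2 on (-1, 1), m = 3.
print(chebyshev_coefficients([Fraction(0), Fraction(3, 5), Fraction(0)]))

# w(x) = 1 on (-1, 1), m = 4: W = 1/2, so Mbar_r = 2 * integral of x**r.
print(chebyshev_coefficients([Fraction(0), Fraction(4, 3), Fraction(0), Fraction(4, 5)]))
```

The first call returns $a_1 = 0$, $a_2 = -\tfrac{3}{10}$, $a_3 = 0$, as in the example; the second returns the coefficients of $x^4 - \tfrac{2}{3}x^2 + \tfrac{1}{45}$, the polynomial with leading coefficient unity that arises for four abscissas when $w(x) = 1$.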
An expression for the error term is obtained most readily by use of the
third method of §5.11. Thus, we find that

$$ \int_{-1}^{x} w(x)\pi(x)\,dx = \int_{-1}^{x} (1 - x^2)(x^3 - \tfrac{3}{10} x)\,dx = -\tfrac{1}{120}(20x^6 - 39x^4 + 18x^2 + 1) = -\tfrac{1}{120}(x^2 - 1)^2(20x^2 + 1), $$

so that $w(x)\pi(x) = V'(x)$, where $V(x) = -\tfrac{1}{120}(x^2 - 1)^2(20x^2 + 1)$.
Since $V(x)$ vanishes at both ends of the interval of integration and is of
constant sign inside that interval, use may be made of (5.11.38), with
$n + 1 = m = 3$ and $r = 1$, to give

$$ E = -\frac{f^{iv}(\xi)}{4!} \int_{-1}^{1} V(x)\,dx $$

or

$$ E = \tfrac{1}{700}\,f^{iv}(\xi), \tag{8.14.33} $$

where $|\xi| < 1$. The same result can be obtained, somewhat more laboriously,
by use of the appropriate influence function (5.11.16). A check
is afforded by an application of the formula to $f(x) = x^4$.
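The check mentioned here takes only a few lines. The sketch below is illustrative only, with a hypothetical helper `rule`; it applies the three-point formula with common weight $\tfrac{4}{9}$ and abscissas $0, \pm\sqrt{3/10}$ to $f(x) = x^4$ and compares the resulting error with $4!/700$, the value the error term predicts when $f^{iv}(x) = 4!$.

```python
from math import sqrt, factorial

def rule(f):
    """Three-point equal-weight formula for the weighting function 1 - x**2 on (-1, 1)."""
    a = sqrt(3 / 10)
    return (4 / 9) * (f(-a) + f(0.0) + f(a))

# Exact value of the integral of (1 - x**2) * x**4 over (-1, 1).
exact = 2 / 5 - 2 / 7

error = exact - rule(lambda x: x**4)
predicted = factorial(4) / 700      # E = f''''(xi) / 700 with f'''' = 4!

print(error, predicted)
```

Both numbers equal $\tfrac{6}{175}$ to within rounding.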

8.15. Supplementary References. Classical references include Mehler
[146], Chebyshev [54], and Radau [184, 185, 186]. For more recent
contributions, see Shohat and Winston [211], Winston [238], Bernstein [36],
Burnett [52], and Greenwood and Danford [96]. The relationship
between Gaussian quadrature and osculating (Hermite) quadrature is
pointed out by Fort [82]. Salzer [271] presents a table of coefficients for
osculating quadrature. Rosser [197] warns of dangers in indiscriminate
use of Hermite-Gauss quadrature. For the use of Gaussian quadrature
in the numerical solution of integral equations, see Reiz [192], Nyström
[169], and Scarborough [17]. Gaussian abscissas and weights are given
by Lowan, Davids, and Levensen [252] for the Legendre case; by Greenwood
and Miller [250], Kopal [251], and Salzer, Zucker, and Capuano
[272] for the Hermite case; in [274] for the Laguerre case; by Radau [185]
for the Lobatto case; and by Salzer [267] for the Chebyshev case. Salzer
[266, 267] gives tables facilitating error detection in the Chebyshev and
Hermite cases. For examples of the algebraic derivation of quadrature
formulas of Gaussian type, see Beard [33].


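PROBLEMS

Section 8.2

1. Obtain the formula

$$ f_s = (1 + 2s)(1 - s)^2 f_0 + (3 - 2s)s^2 f_1 + s(1 - s)^2 h f_0' - s^2(1 - s) h f_1' + \frac{h^4}{4!}\,s^2(1 - s)^2 f^{iv}(\xi), $$

where $f_s = f(x_0 + hs)$ and $x_0 < \xi < x_0 + h$, if $0 \le s \le 1$, and deduce the special
formula

$$ f_{1/2} = \tfrac{1}{2}(f_0 + f_1) + \frac{h}{8}(f_0' - f_1') + \frac{h^4}{384}\,f^{iv}(\xi). $$

2. Obtain the formula

$$ f_s = \tfrac{1}{4}(4 + 3s)s^2(1 - s)^2 f_{-1} + (1 - s^2)^2 f_0 + \tfrac{1}{4}(4 - 3s)s^2(1 + s)^2 f_1 $$
$$ \qquad + \tfrac{1}{4}(1 + s)s^2(1 - s)^2 h f_{-1}' + s(1 - s^2)^2 h f_0' - \tfrac{1}{4}(1 - s)s^2(1 + s)^2 h f_1' + \frac{h^6}{6!}\,s^2(1 - s^2)^2 f^{vi}(\xi), $$

where $f_s = f(x_0 + hs)$ and $x_0 - h < \xi < x_0 + h$, if $|s| \le 1$, and deduce the special
formula

$$ f_{1/2} = \tfrac{1}{128}(11 f_{-1} + 72 f_0 + 45 f_1) + \frac{3h}{128}(f_{-1}' + 12 f_0' - 3 f_1') + \frac{h^6}{5120}\,f^{vi}(\xi), $$

together with a corresponding formula for $f_{-1/2}$.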
3. From the following tabular values of the function

$$ \mathrm{Si}(x) = \int_{0}^{x} \frac{\sin t}{t}\,dt, $$

determine approximate values of Si(2.5) and Si(3.5) by use of the formulas of Probs.
1 and 2:

   x        2.0      3.0      4.0
   Si(x)    1.605    1.849    1.758
Section 8.3

4. From the results of Prob. 1, deduce the formulas

$$ \int_{x_0}^{x_1} f(x)\,dx = \frac{h}{2}(f_0 + f_1) + \frac{h^2}{12}(f_0' - f_1') + \frac{h^5}{720}\,f^{iv}(\xi) $$

and

$$ \int_{x_0}^{x_1} (x - x_0) f(x)\,dx = \frac{h^2}{20}(3 f_0 + 7 f_1) + \frac{h^3}{60}(2 f_0' - 3 f_1') + \frac{h^6}{1440}\,f^{iv}(\xi), $$

where $h = x_1 - x_0$ and $x_0 < \xi < x_1$ in each formula.

5. From the results of Prob. 2, deduce the formula

$$ \int_{x_{-1}}^{x_1} f(x)\,dx = \frac{h}{15}(7 f_{-1} + 16 f_0 + 7 f_1) + \frac{h^2}{15}(f_{-1}' - f_1') + \frac{h^7}{4725}\,f^{vi}(\xi), $$

where $x_1 = x_0 + h = x_{-1} + 2h$ and $x_{-1} < \xi < x_1$.

6. Use the data given in Prob. 3, and the first formula of Prob. 4, to obtain approximate
values of the integral $\int_{a}^{b} \mathrm{Si}(x)\,dx$ for $(a,b) = (2,3)$ and $(3,4)$, and compare the
sum with the result given by the formula of Prob. 5.
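Section 8.4

7. If $B_r$ is the coefficient of $x^{r-1}$ in $\phi_r(x)$, show that the coefficient $b_k$ in the
recurrence formula (8.4.13) is given by

$$ b_k = a_k \left( \frac{B_{k+1}}{A_{k+1}} - \frac{B_k}{A_k} \right), $$

so that (8.4.13) can be written in the form

$$ \phi_{k+1}(x) = \frac{A_{k+1}}{A_k} \left( x + \frac{B_{k+1}}{A_{k+1}} - \frac{B_k}{A_k} \right) \phi_k(x) - \frac{A_{k+1} A_{k-1}}{A_k^2}\,\frac{\gamma_k}{\gamma_{k-1}}\,\phi_{k-1}(x). $$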





8. If $y_n(x)$ is the least-squares polynomial approximation of degree $n$ to $f(x)$ over
$(a,b)$, relative to the weighting function $w(x)$, and if $\phi_r(x)$ is the $r$th relevant orthogonal
polynomial, use the Christoffel-Darboux identity to show that

$$ y_n(x) = \frac{1}{a_n \gamma_n} \int_{a}^{b} \frac{w(t) f(t)}{x - t}\,[\phi_{n+1}(x)\phi_n(t) - \phi_n(x)\phi_{n+1}(t)]\,dt. $$


9. If $(a,b) = (0,1)$ and $w(x) = x$, show that the abscissas in (8.4.6) are the zeros of


an ae 
on(t) = m+ii* : a [am™*1(1 — x)™], 


that the ith weight is given by 
2m+1 


Hi; + enon ’ 
m*(m + 1)*b), (xi) om—1(2i) 





and that the error term is of the form 


eal eerie Ls | (Ss 
B= Sm + 1% {Comytplo"(e: 





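In particular, obtain the formulas

$$ \int_{0}^{1} x f(x)\,dx = \tfrac{1}{2} f(\tfrac{2}{3}) + \tfrac{1}{72}\,f''(\xi) $$

and

$$ \int_{0}^{1} x f(x)\,dx = \frac{9 - \sqrt{6}}{36}\,f\!\left( \frac{6 - \sqrt{6}}{10} \right) + \frac{9 + \sqrt{6}}{36}\,f\!\left( \frac{6 + \sqrt{6}}{10} \right) + \frac{1}{14400}\,f^{iv}(\xi). $$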





Section 8.5 
10. After making an appropriate linear change of variables, determine approximate
values of the integral
2 2 


by use of Gaussian formulas involving two, three, four, and five ordinates, and
compare the approximations with those afforded by corresponding Newton-Cotes
formulas. In each case, obtain an upper bound on the error analytically and verify that
it is conservative.

11. Proceed as in Prob. 10 with the integral


$$ \int_{0}^{\pi/2} \sin x\,dx. $$

12. Proceed as in Prob. 10 with the integral

$$ \int_{0}^{\pi} \sin x\,dx. $$


13. If $f(x)$ can be represented by a power series near $x = \xi$, and if $|f^{(k)}(\xi)|$ is of the
order of magnitude of $k!/R^k$ when $k$ is large, so that $R = R(\xi)$ is the radius of
convergence of that series, show that the error term (8.5.6) is of the order of magnitude of
$\pi/(2R)^{2m}$ when $m$ is large.
14. Obtain approximations to the integral

$$ \int_{-4}^{4} \frac{dx}{1 + x^2} $$

using Gaussian quadratures with two, three, four, and five points, and compare the
results with the true value (see also §3.7).

Section 8.6 
15. Determine approximate values of the integral

$$ \int_{0}^{\infty} e^{-x} \sin x\,dx $$

by use of Laguerre-Gauss quadratures employing two, three, four, and five ordinates.
In each case, obtain an upper bound on the error and verify that it is conservative.


16. Proceed as in Prob. 15 with the integral 


oO e7 ee 
I, jae = 0.200846. 


17. Proceed as in Prob. 15 with respect to the integral 
co) ah 
—— @ te 
f, saae® 0.16776, 





omitting the analytical determination of error bounds, 
18. Derive the results of (8.6.17) to (8.6.19), and obtain the special formula

$$ \int_{0}^{\infty} x^{\beta} e^{-x} f(x)\,dx = \frac{\Gamma(1 + \beta)}{2(2 + \beta)} \left[ (2 + \beta - \sqrt{2 + \beta})\,f(2 + \beta + \sqrt{2 + \beta}) + (2 + \beta + \sqrt{2 + \beta})\,f(2 + \beta - \sqrt{2 + \beta}) \right] + \frac{\Gamma(3 + \beta)}{12}\,f^{iv}(\xi), $$

where $\beta > -1$ and $\xi > 0$, in the case when $m = 2$. Also use this two-point formula
to approximate the integral in Prob. 17, and compare the result with that afforded by
the two-point Laguerre-Gauss formula.

Section 8.7 


19. Determine approximate values of the integral

$$ \int_{-\infty}^{\infty} e^{-x^2} \cos x\,dx = \sqrt{\pi}\,e^{-1/4} $$

by use of Hermite-Gauss quadratures employing two, three, four, and five ordinates.
In each case, obtain an upper bound on the error and verify that it is conservative.

20. Transform the integral of Prob. 17 to the form


as 
oe 
sea eae dt 


and determine approximate values by use of Hermite-Gauss quadratures employing
two, three, four, and five ordinates. Also compare the results with the corresponding
results of Prob. 17.


21. From the following tabulated rounded values of $J_0(x)$, together with the fact
that $J_0(-x) = J_0(x)$, determine approximate values of the integral

$$ \int_{-\infty}^{\infty} e^{-x^2} J_0(x)\,dx = 1.570301 $$

by use of Hermite-Gauss quadrature employing two, three, four, and five ordinates:

   x         0.0        0.5        1.0        1.5        2.0        2.5
   J_0(x)    1.000000   0.938470   0.765198   0.511828   0.223891   -0.048384
Section 8.8 


22. Determine approximate values of the integral

$$ \int_{-1}^{1} \frac{\cos x}{\sqrt{1 - x^2}}\,dx = \pi J_0(1) \approx 2.40394 $$

by use of Chebyshev-Gauss quadratures employing two, three, four, and five ordinates.
In each case, obtain an upper bound on the error and verify that it is conservative.

23. Determine approximate values of the integral


io . 
BEY; w= 7i (ey, een 


by use of Chebyshev-Gauss quadratures employing two, three, four, and five ordinates.
24. Use the results of Prob. 31 of Chap. 7 to deduce the quadrature formula

$$ \int_{-1}^{1} \sqrt{1 - x^2}\,f(x)\,dx = \frac{\pi}{m + 1} \sum_{k=1}^{m} \sin^2 \left( \frac{k\pi}{m + 1} \right) f\!\left( \cos \frac{k\pi}{m + 1} \right) + \frac{\pi}{2^{2m+1}}\,\frac{f^{(2m)}(\xi)}{(2m)!} \qquad (|\xi| < 1). $$

25. Proceed as in Prob. 22, using the formula of Prob. 24 to deal with the integral

$$ \int_{-1}^{1} \sqrt{1 - x^2}\,\cos x\,dx \approx 1.38246. $$
Section 8.9 
26. Determine, to six decimal places, the abscissas and weights in a formula 


i ‘ (1 + 2)¥f(x) de = Hif(es) + Hoyts) + E, 


with degree of precision equal to three, and obtain an expression for the error in terms 
of fi, Also transform the results into a formula of the form 


i at (2) dx = HiF(x,) + HiF (22) + E’. 


27. By making appropriate use of (8.9.12), obtain the quadrature formula 


[a -@a = Y mien +2, 


k=1 


when n is a nonnegative integer, with 2; the ith zero of the polynomial 


ona) = PO, () = Z Paanlt)s 
2(m + 2n)! 
mi(1 — 2%)[P&H (@a?? 


2 ! + )! 2 Q2m+2n+1 on) 1 ; 
= " la a a Im ban piler®) = lel < D 


and with H; = 





and E= 


Section 8.10 
28. Suppose that a quadrature formula of the form

$$ \int_{0}^{\infty} e^{-x} f(x)\,dx = W_1 f(0) + \sum_{k=2}^{m} W_k f(x_k) + E $$

is required, with the abscissa $x_1 = 0$ assigned. Show that the $m - 1$ free abscissas
should then be zeros of the polynomial $\phi_{m-1}(x)$, where


or(x) = Crate* f [er te>8) 


=m Cra-e* [eg dar OO) Hi a rm (re) | 
if the degree of precision is to be maximized. Verify the relation
ar Tox z — d 
da en) re de L(x), 


and hence, by taking C, = 1, deduce that the free abscissas must be zeros of 
$m—1(t) = Lma(z) — Li, 4(2), 
where the prime denotes differentiation. 
29. For the set of polynomials ¢,(z) obtained in Prob. 28, show that 
A, = (-1)*, y= ri(r + 1)!, 
and hence deduce that the weights associated with the free abscissas are given by 
(m — 1)!m! 
Lib (ri) om (wi) 


= (¢ ¥ 1). 
Show also that 
—1 C) 1 
= 5 @ f, fle ter@l az = = 


30. Show that the error term in the formula of Prob. 28 can be obtained by the 
following steps: 


E= Ge a fad e*rbma(x)f[ti, .. . , tm, x] dz 
= (1s f ” fiz, he tote tm] (x™e-*) dx 
=, {sles oy ge , tmc}} x™e—* dx 
= (m — 1)! (2-1) cs fee 
~ Qn — pit @ f, xz™e-* dx [see (5.11.36)] 
(m — 1)!m! 


= Gaaprt”  @ (é > 0). 


31. Show that the results of Probs. 28 to 30 reduce to the formulas

$$ \int_{0}^{\infty} e^{-x} f(x)\,dx = \tfrac{1}{2} f(0) + \tfrac{1}{2} f(2) + \tfrac{1}{3}\,f'''(\xi) $$
and 


when m = 2 and 3, where ¢ > 0 in both eases. 
32. Use the second formula of Prob. 31 to approximate the integral

$$ \int_{0}^{\infty} e^{-x} \sin x\,dx, $$

and compare the result with that of using two (nonzero) ordinates in Prob. 15.
Section 8.11

33. Obtain approximate values of the integral

$$ \int_{0}^{\pi/2} \sin x\,dx $$

by use of Radau quadratures employing two, three, four, and five ordinates, taking
the vanishing ordinate as the assigned one. Also compare the results with corresponding
ones (employing one, two, three, and four nonvanishing ordinates) in Prob. 11.

34. Proceed as in Prob. 33 with the integral

$$ \int_{0}^{1} x \cos x\,dx $$

and compare the results when $m = 2$ and 3 with those given by the two explicit
formulas of Prob. 9 (in which $x$ is to be considered as a weighting function).


Section 8.12 
35. Obtain approximate values of the integral

$$\int_0^{\pi} \sin x\,dx$$

by use of Lobatto quadratures employing three, four, five, and six ordinates. Also
compare the results with corresponding ones (employing one, two, three, and four
nonvanishing ordinates) in Prob. 12.


36. Proceed as in Prob. 35 with the integral

$$\int_{-1}^{1} (1 - x^2) \cos x\,dx,$$

and compare the results when m = 3 and 4 with corresponding ones obtained by using
(8.9.13) and employing like numbers (one and two) of nonvanishing ordinates.
37. Derive the formula 


[an—w + 47a) + 5 (cos =) | = seco @ 
=1 





1 f() as 
i V1 —— mM 


making use of Prob. 31 of Chap. 7 and of Eq. (7.8.21), and noticing that m + 1 ordi- 
nates are employed. [Show that the error term can be expressed in the form 


1 
E= [rota Loe pma,t] dz 
where w(x) = 2-™+1(2? — 1)Sm_i(a) and where xz, = cos (kr /m).] 


Section 8.13 
38. Rework Prob. 10 by use of Chebyshev quadratures employing two, three, four, 
and five ordinates, and compare the results with the results of that problem. 


39. Suppose that independent errors $\epsilon_1, \epsilon_2, \ldots, \epsilon_m$ are associated with the m
ordinates used in a Chebyshev quadrature over (−1,1), and that each of these errors
is distributed about a zero mean with RMS deviation not exceeding $\epsilon_{RMS}$. If R is the
corresponding error in the approximate integral, show that

$$|R|_{\max} \le 2\,|\epsilon|_{\max}, \qquad R_{RMS} \le \frac{2}{\sqrt{m}}\,\epsilon_{RMS}.$$

Show also that the first relation holds for Legendre-Gauss quadrature while the factor
$2/\sqrt{m}$ in the second relation is increased by only about 5 per cent when m = 3, 4,
5 (the same is true for m = 6, 7, and
; » 7, an 9), but that somewhat larger i i i 
rene ent ae ee Newton-Cotes formulas are pal ouae % = aH 
e the values of this fact i i i rabolic 
me ne seer geal or associated with the trapezoidal and parabolic 
oe Metbares the ne es can be replaced by (a,b) in (8.13 1) and (8.13.17) 
: —,), and, in the Laguerre-Chebyshev c h Gided 
w(x) = e-*, show that the m relevant absci dei 
she Pl Ti taba abscissas are to be the zeros of the polynomial 


e 1 El. 2! ! 
2 exp [ -m (2 t5t+5+34... )| 
er pate oe m = 
x mx * + y(m — 2)am—* — 7 mt — 6m + 12)nm-3 
m 
+ 5g (m> — 12m? + 60m — 144)am-4 + eee 


if those zeros are real. Show furth 

i ; ‘ : er that, when m = 2, the quad 
t ¢ 
pees _ the oe oooh of Prob. 31, and also that ne of Rp iain 
, nm = 3, so that no three-point formula of th i 

" 

rae pee has been shown to be true for all msuch that 3 < eee peat apterds 
Pere sriied the validity of (8.13.17) with (—1,1) replaced by (— 3) in ae 
ebyshev case when also w(x) = e~*? (a modified derivation is hesaakesy. 
, 


show that the relevant abscissas 
alien ce are to be the zeros of the polynomial part of the 


mm 1 3 
e _- — sri 
. xp | m(gatagit ++ )) eet Barts im oars. 


that th 
€ zeros are real when m = 2 and m = 8, but that two of the zeros are complex 


when m=4 and a 5 Pp: 
m= * [ T he resence of com plex zeros has been est ab. it ed (0) 
( ) 1 sh f r all 


42. Show that the two-poi 
I -point formula of Prob. 41 is identical wi i 
Gauss two-point formula, and that the three-point formula is of ee ree 


[ose a = [5 (- 3) 4.50) +5(¥8)] +z 


Show also that the error term can be expressed in the form 


my 


aa 1 O(a — Fx) f[x1,22,03,2] dr 


and transformed by integration by parts to give 


a Soca 
i= 2 i = € =*(at “f Eo [x1,20,005,2,2] dx 


To 
= ei f ¥(é), 
for some value of £. 


Section 8.14 


43. Determine algebraically the unknown abscissas and/or weights for the formula

$$\int_{-1}^{1} f(x)\,dx = \sum_{k=1}^{3} W_k f(x_k) + E$$
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subject to the requirement that the degree of precision be as high as possible in con- 
sistency with each of the following sets of constraints, and determine the degree of 
precision in each case: 

(a) 21 = —}, m2 = 0,22 = 5. 

(b) No constraints. 

(c) $W_1 = W_2 = W_3$.

(d) $x_1 = -1$.

44. Suppose that the abscissas $x_1 = -1$ and $x_2 = a$ are assigned, and that the
quadrature formula

$$\int_{-1}^{1} f(x)\,dx = W_1 f(-1) + W_2 f(a) + W_3 f(x_3) + E \qquad (-1 < a \le 1)$$

is to possess a degree of precision of at least three. Determine $x_3$ and the three
weights as functions of a, by algebraic methods, showing, in particular, that no such
formula exists if $a = \tfrac13$, that $x_3$ is outside (−1,1) for all other a such that $0 < a < \tfrac12$,
and that the ordinate at x = −1 is not involved if $a = \pm 1/\sqrt{3}$.

45. Show that the two-point Gaussian quadrature formula of the form 


[[r@ sin na dx = Wif(a1) + Wof (ee) + B, 


where n is a nonnegative integer, is such that 


ant-NG)-8 = 3+VG)-% 
Tv 


Wi= —W.= 





n Vx? — 24/n? 
when n is even, and such that 


72 2 2 2 
an5-NG)-% = -atNG)- 


Pee Wie 
n 


when nis odd. Show also that the degree of precision is four when 7n is even and th: 


when n is odd. 
46. Show that the error term in the quadrature formula of Prob. 45 can be expr 


in the form 


E= ff teessesale@) sin na dx = F [irae sin na dx (0 <n<7), 


where r(x) = 2? — wz + 6n7? when n is even and x(x) = 2? — wx + 2n7* when n 
odd, and that, when n = 1, this expression can be transformed to 


BE =f" di x(a) sin eda = 10 =" p(t) (0<& <7m), 


[Notice that here w(x) = sin nx changes sign inside the range of integration w 
n> 1] 
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47. Derive the Gaussian integration formulas 


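$$\int_0^1 f(x) \log x\,dx = -f(\tfrac14) + E$$

and

$$\int_0^1 f(x) \log x\,dx = -W_1 f(x_1) - W_2 f(x_2) + E,$$

where

$$x_1 = \frac{15 - \sqrt{106}}{42} \doteq 0.112009, \qquad x_2 = \frac{15 + \sqrt{106}}{42} \doteq 0.602277$$

and

$$W_1 = \frac{212 + 9\sqrt{106}}{424} \doteq 0.718539, \qquad W_2 = \frac{212 - 9\sqrt{106}}{424} \doteq 0.281461.$$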
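48. Show that the error terms associated with the two formulas of Prob. 47 are of
the forms

$$E = \int_0^1 (x - \tfrac14)^2 f[\tfrac14, \tfrac14, x] \log x\,dx = -\tfrac{7}{288}\,f''(\xi)$$

and

$$E = \int_0^1 \left(x^2 - \tfrac57 x + \tfrac{17}{252}\right)^2 f[x_1, x_1, x_2, x_2, x] \log x\,dx = -\tfrac{647}{5443200}\,f^{\mathrm{iv}}(\xi) \doteq -0.00012\,f^{\mathrm{iv}}(\xi),$$

respectively, where 0 < ξ < 1 in each case.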


CHAPTER 9 


APPROXIMATIONS OF VARIOUS TYPES 































9.1. Introduction. Whereas polynomials are usually the most convenient
coordinate functions for the approximation of a continuous function
(or for least-squares approximation of a function which is continuous
except for finite “jumps”) when the desired interval of approximation is
finite, they are well adapted to the approximation of periodic functions
only over relatively short ranges. When f(x) is periodic and is to be
approximated over one or more complete periods, it is desirable to make
use of periodic coordinate functions, having the same period as f(x), in
constructing its approximation. The most convenient set of such functions
(which, indeed, satisfies all the requirements of §1.2 when f is also
continuous) is the composite set of all sines and cosines which possess that
period. While formulas analogous to Lagrange’s formula exist for the
determination of such an approximation, they are seldom used, and resort
is usually had to least-squares methods. The relevant analysis, due
originally to Fourier and often known as harmonic analysis, is presented
and illustrated for continuous ranges in §9.2 and for discrete ranges in
§9.3.

When empirical data correspond to a simple decay or growth process,
or to a combination of such processes, and an approximation is desired
for a semi-infinite range of the independent variable (frequently
representing time), real exponential functions are appropriate coordinate
functions. On the other hand, when the superposition of two or more simple
or damped harmonics, of unknown periods, is to be analyzed, complex
exponential functions are appropriate. Prony’s method of curve fitting,
which includes both these cases, is presented in §9.4 and is specialized to
the second case in §9.5.

Methods of optimum polynomial interpolation, based on preselected
abscissas, are considered in §§9.6 and 9.7, and the Lanczos method of
improving the efficiency of a given polynomial approximation is described
in §9.8.

A natural generalization of polynomial approximation consists of
approximation by ratios of polynomials, that is, by rational functions.
Such approximations are expressed conveniently in terms of continued
fractions and are treated in the concluding sections of this chapter
(§§9.9 to 9.12).
9.2. Fourier Approximation: Continuous Range. We suppose here
that the function f(x) to be approximated is a periodic function, of known
period, and that the scale of units has been so adjusted that the period is
2π:

$$f(x + 2\pi) = f(x). \tag{9.2.1}$$

A particularly convenient class of coordinate functions is represented by
the set

$$1,\ \cos x,\ \cos 2x,\ \ldots,\ \cos rx,\ \ldots;\qquad \sin x,\ \sin 2x,\ \ldots,\ \sin rx,\ \ldots,$$

each member of which is of period 2π. This set has the useful property
that the product of any two members is expressible as a linear combination
of two members. Also, the derivative of each member is a constant
multiple of a member, and the same is true of the integral of each member
other than the constant.

But the principal source of convenience is the verifiable fact that the
set is orthogonal over any period interval, say the interval (−π,π), so that

$$\int_{-\pi}^{\pi} \sin jx \sin kx\,dx = 0 \quad (j \ne k),$$
$$\int_{-\pi}^{\pi} \cos jx \cos kx\,dx = 0 \quad (j \ne k), \tag{9.2.2}$$
$$\int_{-\pi}^{\pi} \sin jx \cos kx\,dx = 0,$$

when j and k are nonnegative integers. Clearly, negative integers need
not be considered.

Suppose, then, that we require an approximation of the form

$$f(x) \approx a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx), \tag{9.2.3}$$

where the coefficients are to be determined in such a way that the
integrated squared error is least. From the periodicity of f(x) and of the
sine and cosine harmonics, it follows that attention may be restricted to
one period interval, say the interval (−π,π). Thus the requirement

$$\int_{-\pi}^{\pi} \Big[ f(x) - a_0 - \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \Big]^2 dx = \min \tag{9.2.4}$$

leads to the conditions

$$\int_{-\pi}^{\pi} \Big[ f(x) - a_0 - \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \Big] dx = 0,$$
$$\int_{-\pi}^{\pi} \cos rx\, \Big[ f(x) - a_0 - \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \Big] dx = 0 \quad (r = 1, 2, \ldots, n), \tag{9.2.5}$$
$$\int_{-\pi}^{\pi} \sin rx\, \Big[ f(x) - a_0 - \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \Big] dx = 0 \quad (r = 1, 2, \ldots, n),$$

when the partial derivatives of the left-hand member of (9.2.4) with
respect to $a_0$, $a_r$, and $b_r$ are equated to zero. Reference to the relations
(9.2.2), and to the relations

$$\int_{-\pi}^{\pi} \cos^2 kx\,dx = \int_{-\pi}^{\pi} \sin^2 kx\,dx = \pi \quad (k \ne 0), \tag{9.2.6}$$

then leads to the determinations

$$a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx, \qquad a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx\,dx \quad (k \ne 0),$$
$$b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx\,dx. \tag{9.2.7}$$
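If f(x) is an even function, so that f(−x) = f(x), it is seen that $b_k = 0$,
so that (9.2.3) then reduces to

$$f(x) \approx a_0 + \sum_{k=1}^{n} a_k \cos kx \tag{9.2.8}$$

where

$$a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx = \frac{1}{\pi} \int_0^{\pi} f(x)\,dx,$$
$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx\,dx = \frac{2}{\pi} \int_0^{\pi} f(x) \cos kx\,dx \quad (k \ne 0). \tag{9.2.9}$$

Similarly, if f(x) is an odd function, so that f(−x) = −f(x), there
follows $a_0 = a_k = 0$, and (9.2.3) then becomes

$$f(x) \approx \sum_{k=1}^{n} b_k \sin kx \tag{9.2.10}$$

where

$$b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx\,dx = \frac{2}{\pi} \int_0^{\pi} f(x) \sin kx\,dx. \tag{9.2.11}$$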
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If the periodic function f(x) is fairly well behaved, in particular, if only 
f(x) is bounded and piecewise differentiable, it is known that the
approximation actually tends to f(x) as n → ∞ for all values of x at which f(x)
is continuous, and that it tends to the mean value ½[f(x+) + f(x−)] of
the right- and left-hand limits at each point of discontinuity.

It is important to notice that, as is typical of least-squares approx- 
imations by orthogonal functions, each coefficient is determined inde- 
pendently of all others, and its value does not depend upon the number 
of harmonics to be retained in the approximation. 

As an example, suppose that f(x) is defined over (−π,π) in such a way
that

$$f(x) = \begin{cases} 0 & (-\pi < x \le 0), \\ x & (0 \le x \le \pi/2), \\ \pi/2 & (\pi/2 \le x \le \pi), \end{cases} \tag{9.2.12}$$

and is defined elsewhere by the requirement that it be periodic, with
period 2π (see Fig. 9.1). Since f(x) is neither even nor odd, the presence
of both sine and cosine harmonics may be anticipated. Equations (9.2.7)
give

[Fig. 9.1: graph of f(x)]

$$a_0 = \frac{1}{2\pi} \Big[ \int_{-\pi}^{0} 0\,dx + \int_0^{\pi/2} x\,dx + \int_{\pi/2}^{\pi} \frac{\pi}{2}\,dx \Big] = \frac{3\pi}{16},$$
$$a_k = \frac{1}{\pi} \Big[ \int_{-\pi}^{0} 0 \cdot \cos kx\,dx + \int_0^{\pi/2} x \cos kx\,dx + \int_{\pi/2}^{\pi} \frac{\pi}{2} \cos kx\,dx \Big] = -\frac{1}{\pi k^2} \Big( 1 - \cos \frac{k\pi}{2} \Big) \quad (k \ne 0),$$
$$b_k = \frac{1}{\pi} \Big[ \int_{-\pi}^{0} 0 \cdot \sin kx\,dx + \int_0^{\pi/2} x \sin kx\,dx + \int_{\pi/2}^{\pi} \frac{\pi}{2} \sin kx\,dx \Big] = \frac{1}{\pi k^2} \Big( \sin \frac{k\pi}{2} - \frac{k\pi}{2} \cos k\pi \Big).$$

Thus there follows

$$f(x) \approx \frac{3\pi}{16} - \frac{1}{\pi} \cos x - \frac{1}{2\pi} \cos 2x - \frac{1}{9\pi} \cos 3x - \cdots + \frac{2 + \pi}{2\pi} \sin x - \frac{1}{4} \sin 2x + \frac{3\pi - 2}{18\pi} \sin 3x - \cdots. \tag{9.2.13}$$

If the best least-squares approximation to f(x) involving only harmonics
through the second were required, it would be obtained, by suppressing
all higher harmonics in (9.2.13), in the form f(x) ≈ 0.589 − 0.318 cos x
− 0.159 cos 2x + 0.818 sin x − 0.250 sin 2x if (say) the coefficients were
rounded to three places.
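
The coefficients of this example can be checked numerically. The following is only a sketch: it approximates the integrals of (9.2.7) by a midpoint rule (an arbitrary choice made here, not a method of the text) and compares them with the closed forms obtained above. The names `fourier_coefficients`, `closed_form`, and `panels` are illustrative.

```python
import math

def f(x):
    # The example function (9.2.12) on one period interval (-pi, pi].
    if x <= 0.0:
        return 0.0
    if x <= math.pi / 2:
        return x
    return math.pi / 2

def fourier_coefficients(func, n, panels=20000):
    """Approximate a_0..a_n and b_1..b_n of (9.2.7) by the midpoint rule.

    `panels` is chosen so that the breakpoints 0 and pi/2 of the example
    fall on panel boundaries; that choice is an assumption of this sketch.
    """
    h = 2 * math.pi / panels
    xs = [-math.pi + (i + 0.5) * h for i in range(panels)]
    fs = [func(x) for x in xs]
    a = [sum(fs) * h / (2 * math.pi)]   # a_0 carries the factor 1/(2 pi)
    b = [0.0]                           # b_0 is not used
    for k in range(1, n + 1):
        a.append(sum(v * math.cos(k * x) for v, x in zip(fs, xs)) * h / math.pi)
        b.append(sum(v * math.sin(k * x) for v, x in zip(fs, xs)) * h / math.pi)
    return a, b

def closed_form(k):
    # Closed forms of a_k and b_k (k != 0) for the example function.
    a_k = -(1 - math.cos(k * math.pi / 2)) / (math.pi * k * k)
    b_k = (math.sin(k * math.pi / 2)
           - (k * math.pi / 2) * math.cos(k * math.pi)) / (math.pi * k * k)
    return a_k, b_k

a, b = fourier_coefficients(f, 3)
print([round(v, 3) for v in (a[0], a[1], a[2], b[1], b[2])])
```

Because each coefficient is computed independently of the others, asking for more harmonics does not change the values already obtained.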

Because of the discontinuities in f(x), a rather large number of terms 
would be needed, in this particular case, to afford a good approximation 
to f(x), particularly near the discontinuities. However, there are in fact 
many practical situations in which only the coefficients of certain har- 
monics of low order are required and in which the degree of approxima- 
tion afforded by a given number of harmonics is not of great interest. 

It is clear that if f(x) were not periodic, but were defined by (9.2.12) in
the interval (−π,π), the expansion (9.2.13) would still be valid inside the
interval, regardless of the behavior of f(x) elsewhere.

More generally, if the series (9.2.3) were determined according to the
formulas of (9.2.7), for any function f(x) for which the integrals exist, the
result of retaining a certain number of harmonics would afford the
corresponding least-squares approximation to f(x) over the interval (−π,π).
Outside that interval, the trigonometric expression would continue
periodically, regardless of the behavior of f(x) itself outside that interval.

Further, if f(x) is defined in (0,π), and if the coefficients in the
approximation

$$f(x) \approx a_0 + \sum_{k=1}^{n} a_k \cos kx \qquad (0 < x < \pi) \tag{9.2.14}$$

are determined by the equations

$$a_0 = \frac{1}{\pi} \int_0^{\pi} f(x)\,dx, \qquad a_k = \frac{2}{\pi} \int_0^{\pi} f(x) \cos kx\,dx \quad (k = 1, 2, \ldots, n), \tag{9.2.15}$$

the result will represent the least-squares cosine-harmonic approximation
to f(x) over (0,π). Similarly, the least-squares sine-harmonic
approximation over that half range is given by

$$f(x) \approx \sum_{k=1}^{n} b_k \sin kx \qquad (0 < x < \pi), \tag{9.2.16}$$

where

$$b_k = \frac{2}{\pi} \int_0^{\pi} f(x) \sin kx\,dx, \tag{9.2.17}$$

as follows from the relations

$$\int_0^{\pi} \sin jx \sin kx\,dx = \begin{cases} 0 & (j \ne k), \\ \pi/2 & (j = k \ne 0), \end{cases} \tag{9.2.18}$$

$$\int_0^{\pi} \cos jx \cos kx\,dx = \begin{cases} 0 & (j \ne k), \\ \pi/2 & (j = k \ne 0), \\ \pi & (j = k = 0), \end{cases} \tag{9.2.19}$$

where j and k are nonnegative integers. These relations can be
deduced directly from (9.2.2).
9.3. Fourier Approximation: Discrete Range. We suppose again that
f(x) is of period 2π, but that its values are known only at a discrete set of
equally spaced points in a period interval, say at the 2N + 1 points

$$-\pi,\ -\frac{(N-1)\pi}{N},\ \ldots,\ -\frac{\pi}{N},\ 0,\ \frac{\pi}{N},\ \ldots,\ \frac{(N-1)\pi}{N},\ \pi.$$

Since f(−π) = f(π), from the assumed periodicity,† we then have 2N
independent data, which may be expected to serve to determine the
coefficients of 2N terms of an approximation of the form (9.2.3).

If we denote the rth abscissa as

$$x_r = \frac{r\pi}{N} \qquad (r = -N, -N+1, \ldots, N-1, N), \tag{9.3.1}$$

so that the 2N independent values are $f_r = f(x_r)$, we may
verify that only the 2N functions

$$1,\ \cos x,\ \cos 2x,\ \ldots,\ \cos Nx;\qquad \sin x,\ \sin 2x,\ \ldots,\ \sin (N-1)x$$

of the set considered in the preceding section are independent over the
domain comprising this set of abscissas, for the function sin Nx vanishes
at each of these points, and each of the functions cos (N + 1)x,
sin (N + 1)x, . . . takes on the same values at points in the set as does
one of the 2N functions listed above. For example, since $\sin Nx_r = 0$,

$$\cos (N+1)x_r = \cos Nx_r \cos x_r = (-1)^r \cos x_r = \cos (N-1)x_r.$$

It is possible to show that this set of functions is orthogonal under
summation over the set (9.3.1) (see Probs. 7 and 8), so that, with the

† If f(x) is discontinuous at the ends of the period interval (−π,π), or undefined
outside that interval, the mean value ½[f(−π+) + f(π−)] is to be assigned to f(x)
at both ends of the interval.
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notation of (9.3.1), 
$$\sum_{r=-N+1}^{N} \sin jx_r \sin kx_r = 0 \quad (j \ne k),$$
$$\sum_{r=-N+1}^{N} \cos jx_r \cos kx_r = 0 \quad (j \ne k), \tag{9.3.2}$$
$$\sum_{r=-N+1}^{N} \sin jx_r \cos kx_r = 0,$$

when j and k are integers between 0 and N, inclusive, in analogy to (9.2.2).
Furthermore, in the excluded cases for which j = k, the results

$$\sum_{r=-N+1}^{N} \sin^2 kx_r = \sum_{r=-N+1}^{N} \cos^2 kx_r = N \quad (k \ne 0, N),$$
$$\sum_{r=-N+1}^{N} 1 = 2N, \qquad \sum_{r=-N+1}^{N} \cos^2 Nx_r = 2N \tag{9.3.3}$$

can be established, in analogy to (9.2.6).
If now an approximation is assumed in the form

$$f(x) \approx A_0 + \sum_{k=1}^{n} (A_k \cos kx + B_k \sin kx), \tag{9.3.4}$$

where n < N, and if the least-squares criterion

$$\sum_{r=-N+1}^{N} \Big[ f(x_r) - A_0 - \sum_{k=1}^{n} (A_k \cos kx_r + B_k \sin kx_r) \Big]^2 = \min \tag{9.3.5}$$

is imposed, a derivation completely analogous to that leading from (9.2.4)
to (9.2.7), making use of (9.3.2) and (9.3.3), yields the determinations

$$A_0 = \frac{1}{2N} \sum_{r=-N+1}^{N} f(x_r), \qquad A_k = \frac{1}{N} \sum_{r=-N+1}^{N} f(x_r) \cos kx_r \quad (k \ne 0, N),$$
$$A_N = \frac{1}{2N} \sum_{r=-N+1}^{N} f(x_r) \cos Nx_r, \qquad B_k = \frac{1}{N} \sum_{r=-N+1}^{N} f(x_r) \sin kx_r. \tag{9.3.6}$$

Thus the coefficients in (9.3.4) are easily obtained by summation,
the calculation of each coefficient is again independent of the calculation
of the others, and is independent of n, so long as n ≤ N. When n = N,
the least-squares criterion becomes equivalent to the requirement that
the two members of (9.3.4) agree exactly at the 2N points specified by
(9.3.1).
The formulas (9.3.6) can be written in the more symmetrical forms

$$A_0 = \frac{1}{2N} \big[ \tfrac12 f_{-N} + f_{-N+1} + \cdots + f_{-1} + f_0 + f_1 + \cdots + f_{N-1} + \tfrac12 f_N \big],$$
$$A_k = \frac{1}{N} \big[ \tfrac12 f_{-N} \cos kx_{-N} + f_{-N+1} \cos kx_{-N+1} + \cdots + f_{-1} \cos kx_{-1} + f_0 \cos kx_0 + f_1 \cos kx_1 + \cdots + f_{N-1} \cos kx_{N-1} + \tfrac12 f_N \cos kx_N \big] \quad (k \ne 0, N),$$
$$A_N = \frac{1}{2N} \big[ \tfrac12 f_{-N} \cos Nx_{-N} + f_{-N+1} \cos Nx_{-N+1} + \cdots + f_{-1} \cos Nx_{-1} + f_0 \cos Nx_0 + f_1 \cos Nx_1 + \cdots + f_{N-1} \cos Nx_{N-1} + \tfrac12 f_N \cos Nx_N \big], \tag{9.3.7}$$
$$B_k = \frac{1}{N} \big[ \tfrac12 f_{-N} \sin kx_{-N} + f_{-N+1} \sin kx_{-N+1} + \cdots + f_{-1} \sin kx_{-1} + f_0 \sin kx_0 + f_1 \sin kx_1 + \cdots + f_{N-1} \sin kx_{N-1} + \tfrac12 f_N \sin kx_N \big],$$

in view of the relation $f_{-N} = f_N$.
If we notice that the spacing h is given by

$$h = \frac{\pi}{N}, \tag{9.3.8}$$

we may observe the curious fact that Eqs. (9.3.7) are identical with the
results of using the trapezoidal rule to approximate the right-hand members
of (9.2.7), when k < N.†

For the purpose of numerical calculation, it is convenient to resolve
f(x) into its even and odd components, by introducing the auxiliary
functions

$$F(x) = \tfrac12 [f(x) + f(-x)], \qquad G(x) = \tfrac12 [f(x) - f(-x)], \tag{9.3.9}$$

so that

$$f(x) = F(x) + G(x). \tag{9.3.10}$$

If we recall that $x_{-r} = -x_r$, and that $x_0 = 0$, we find that Eqs. (9.3.6) or

† In this connection, it is interesting to recall that the Euler-Maclaurin sum
formula, written in the form (5.9.1), reduces to the trapezoidal rule for any periodic
function, with period equal to the length of the range of integration. That is, the
“correction terms” in that formula all vanish in any such case. This fact obviously
does not indicate that the trapezoidal rule is “exact” for periodic functions, since the
error term (5.8.14) remains, but may indeed serve to illustrate the dangers associated
with lack of proper regard for the error term in such formulas.
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(9.3.7) may be reduced to the forms 
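$$A_0 = \frac{1}{N} \big( \tfrac12 F_0 + F_1 + F_2 + \cdots + F_{N-1} + \tfrac12 F_N \big),$$
$$A_k = \frac{2}{N} \big( \tfrac12 F_0 + F_1 \cos kx_1 + F_2 \cos kx_2 + \cdots + F_{N-1} \cos kx_{N-1} + \tfrac12 F_N \cos kx_N \big) \quad (k \ne 0, N), \tag{9.3.11}$$
$$A_N = \frac{1}{N} \big( \tfrac12 F_0 - F_1 + F_2 - \cdots + (-1)^{N-1} F_{N-1} + \tfrac12 (-1)^N F_N \big),$$
$$B_k = \frac{2}{N} \big( G_1 \sin kx_1 + G_2 \sin kx_2 + \cdots + G_{N-1} \sin kx_{N-1} \big).$$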


In order to illustrate the use of these formulas, we consider the case 
N = 6, corresponding to the use of 12 independent ordinates. The 
tabular forms which follow are then appropriate (although further sys- 
tematization is clearly possible) : 


TABLE 9.1a

  x      Data                       cos x    cos 2x   cos 3x   cos 4x   cos 5x
  0      ½f_0 = ½F_0                  1        1        1        1        1
  π/6    ½(f_1 + f_-1) = F_1         ½√3       ½        0       −½      −½√3
  π/3    ½(f_2 + f_-2) = F_2          ½       −½       −1       −½        ½
  π/2    ½(f_3 + f_-3) = F_3          0       −1        0        1        0
  2π/3   ½(f_4 + f_-4) = F_4         −½       −½        1       −½       −½
  5π/6   ½(f_5 + f_-5) = F_5        −½√3       ½        0       −½       ½√3
  π      ½f_6 = ½F_6                 −1        1       −1        1       −1
  Sum    6A_0                        3A_1     3A_2     3A_3     3A_4     3A_5

















In Table 9.1a, the sum of the entries in the data column is $6A_0$, whereas
the sum of products of corresponding entries in the data column and the
column headed cos kx is $3A_k$ or $6A_6$. Similarly, the sum of products of
corresponding entries in the data column of Table 9.1b and the column
headed sin kx is $3B_k$.

In illustration, for the empirical data

  θ    0°    30°   60°   90°   120°  150°  180°  210°  240°  270°  300°  330°  360°
  f    1.21  1.32  1.46  1.40  1.34  1.18  1.07  1.01  1.05  1.10  1.14  1.17  1.21
TABLE 9.1b

  x      Data                       sin x    sin 2x   sin 3x   sin 4x   sin 5x
  π/6    ½(f_1 − f_-1) = G_1          ½       ½√3       1       ½√3       ½
  π/3    ½(f_2 − f_-2) = G_2         ½√3      ½√3       0      −½√3     −½√3
  π/2    ½(f_3 − f_-3) = G_3          1        0       −1        0        1
  2π/3   ½(f_4 − f_-4) = G_4         ½√3     −½√3       0       ½√3     −½√3
  5π/6   ½(f_5 − f_-5) = G_5          ½      −½√3       1      −½√3       ½
  Sum                                3B_1     3B_2     3B_3     3B_4     3B_5














the entries in the respective data columns of Tables 9.1a, b are

  Table 9.1a   Table 9.1b
    0.605
    1.245        0.075
    1.300        0.160
    1.250        0.150
    1.195        0.145
    1.095        0.085
    0.535

and calculation gives

  A_0 = 1.204,  A_1 = 0.084,  A_2 = −0.062,  A_3 = −0.012,  A_4 = −0.009,
  B_1 = 0.165,  B_2 = 0.001,  B_3 = 0.003,   B_4 = −0.007,

for the coefficients of harmonics through the fourth.
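
The tabular calculation can be reproduced in a few lines. This is a minimal sketch assuming the twelve tabulated ordinates above, with θ = 0° taken as x = 0 and the ordinates at negative abscissas supplied by periodicity; the names `f_deg`, `f_r`, `A`, and `B` are illustrative only. It applies the reduced forms (9.3.11) with N = 6.

```python
import math

# Empirical ordinates, indexed by angle in degrees (period 360 degrees).
f_deg = {0: 1.21, 30: 1.32, 60: 1.46, 90: 1.40, 120: 1.34, 150: 1.18,
         180: 1.07, 210: 1.01, 240: 1.05, 270: 1.10, 300: 1.14, 330: 1.17}
N = 6

def f_r(r):
    # Ordinate at x_r = r*pi/N; negative r is mapped by periodicity.
    return f_deg[(30 * r) % 360]

# Even and odd components F_r and G_r of (9.3.9) at x_0 .. x_N.
F = [0.5 * (f_r(r) + f_r(-r)) for r in range(N + 1)]
G = [0.5 * (f_r(r) - f_r(-r)) for r in range(N + 1)]
x = [r * math.pi / N for r in range(N + 1)]

def A(k):
    # Cosine coefficients from (9.3.11); the end ordinates carry weight 1/2.
    s = (0.5 * F[0]
         + sum(F[r] * math.cos(k * x[r]) for r in range(1, N))
         + 0.5 * F[N] * math.cos(k * x[N]))
    return s / N if k in (0, N) else 2 * s / N

def B(k):
    # Sine coefficients from (9.3.11).
    return 2 * sum(G[r] * math.sin(k * x[r]) for r in range(1, N)) / N

for k in range(5):
    print(k, A(k), B(k) if k else None)
```

The printed values agree with the three-place coefficients quoted above to within rounding of the last place.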

In order to obtain a seven-point cosine-series approximation to a function
F(x) over the half range 0 ≤ x ≤ π, through harmonics of order not
exceeding six, use would be made of Table 9.1a only, whereas for a five-point
sine-series approximation to G(x) over the same half range, through
harmonics of order not exceeding five, only Table 9.1b would be used.
In any case, if all the available harmonics are retained, the resultant
approximation takes on the prescribed value at each of the points
employed in the calculation. Retention of a smaller number of harmonics
leads to the appropriate least-squares approximation relevant to
that set of points.†

Tables corresponding to those given here, but employing larger sets of
data and further systematized in various ways, may be found in the
literature (see Whittaker and Robinson [20] and Willers [21]). A related
procedure is described in §9.7.

† For the cosine approximation, the result corresponds to the use of one-half weights
with respect to the errors at 0 and N (see Prob. 9).
9.4. Exponential Approximation. In certain cases it is desired to
determine an approximation of the form

$$f(x) \approx C_1 e^{a_1 x} + C_2 e^{a_2 x} + \cdots + C_n e^{a_n x} \tag{9.4.1}$$

or, equivalently, of the form

$$f(x) \approx C_1 \mu_1^x + C_2 \mu_2^x + \cdots + C_n \mu_n^x, \tag{9.4.2}$$

where

$$\mu_k = e^{a_k}. \tag{9.4.3}$$

It is somewhat more convenient here to work with the second form
(9.4.2). We suppose that a linear change of variables has been introduced,
in advance, in such a way that the values of f(x) are specified at
the N equally spaced points x = 0, 1, 2, . . . , N − 1. If (9.4.2) were an
equality for these values of x, the equations

$$C_1 + C_2 + \cdots + C_n = f_0,$$
$$C_1 \mu_1 + C_2 \mu_2 + \cdots + C_n \mu_n = f_1,$$
$$C_1 \mu_1^2 + C_2 \mu_2^2 + \cdots + C_n \mu_n^2 = f_2, \tag{9.4.4}$$
$$\cdots$$
$$C_1 \mu_1^{N-1} + C_2 \mu_2^{N-1} + \cdots + C_n \mu_n^{N-1} = f_{N-1}$$

would necessarily be satisfied, and the approximation (9.4.2) may be
based on the result of satisfying these equations as nearly as possible.
If the constants $\mu_1, \ldots, \mu_n$ were known (or preassigned), this set would
comprise N linear equations in the n unknowns $C_1, \ldots, C_n$ and could
be solved exactly if N = n or approximately, by the least-squares
method, if N > n.

However, if the μ’s are also to be determined, at least 2n equations
are needed, and the difficulty consists in the fact that the equations are
nonlinear in the μ’s. This difficulty can be minimized by a method,
similar to methods used in §8.14, next to be described.
Let $\mu_1, \ldots, \mu_n$ be the roots of the algebraic equation

$$\mu^n - \alpha_1 \mu^{n-1} - \alpha_2 \mu^{n-2} - \cdots - \alpha_{n-1} \mu - \alpha_n = 0, \tag{9.4.5}$$

so that the left-hand member of (9.4.5) is identified with the product
$(\mu - \mu_1)(\mu - \mu_2) \cdots (\mu - \mu_n)$. In order to determine the coefficients
$\alpha_1, \ldots, \alpha_n$, we multiply the first equation in (9.4.4) by $\alpha_n$, the second
equation by $\alpha_{n-1}$, . . . , the nth equation by $\alpha_1$, and the (n + 1)th
equation by −1, and add the results. If use is made of the fact that
each μ satisfies (9.4.5), the result is seen to be of the form

$$f_n - \alpha_1 f_{n-1} - \cdots - \alpha_n f_0 = 0.$$

A set of N − n − 1 additional equations of similar type is obtained
in the same way by starting instead successively with the second, third,
. . . , (N − n)th equations. In this way we find that (9.4.4) and (9.4.5)
imply the N − n linear equations

$$f_{n-1} \alpha_1 + f_{n-2} \alpha_2 + \cdots + f_0 \alpha_n = f_n,$$
$$f_n \alpha_1 + f_{n-1} \alpha_2 + \cdots + f_1 \alpha_n = f_{n+1}, \tag{9.4.6}$$
$$\cdots$$
$$f_{N-2} \alpha_1 + f_{N-3} \alpha_2 + \cdots + f_{N-n-1} \alpha_n = f_{N-1}.$$

Since the ordinates $f_k$ are known, this set generally can be solved directly
for the n α’s if N = 2n, or solved approximately, by the method of least
squares, if N > 2n.

After the α’s are determined, the n μ’s are found as the roots of (9.4.5).
They may be real or complex. The equations (9.4.4) then become linear
equations in the n C’s, with known coefficients. The C’s can be determined,
finally, from the first n of these equations or, preferably, by applying
the least-squares technique to the entire set.

Thus the nonlinearity of the system is concentrated in the single
algebraic equation (9.4.5). The technique described is known as Prony’s
method.

Obvious modifications are necessary when certain of the μ’s (or a’s)
are prescribed and the remainder are to be determined. When such
constraints are imposed, and are to be satisfied exactly, it is essential to
satisfy them (by using them to eliminate unknowns from the set of equations
to be solved) before applying the method of least squares.

The most common situation of this sort is that in which it is known
that f(x) tends to a finite limit (the value of which is generally unknown)
as x → ∞. The approximation

$$f(x) \approx C_0 + C_1 e^{a_1 x} + \cdots + C_n e^{a_n x} \tag{9.4.7}$$

is then appropriate, where the a’s are expected to have negative real
parts. Since this approximation implies that

$$\Delta f(x) \approx C_1' e^{a_1 x} + \cdots + C_n' e^{a_n x},$$

where the coefficient $C_k'$ is an unknown constant which is simply related
to the unknown $C_k$, the equations (9.4.6) may be modified, in this case,
by replacing each $f_k$ by the difference $\Delta f_k = f_{k+1} - f_k$, after which the
α’s and μ’s are determined as before. The equations (9.4.4) are then
modified by the insertion of the unknown $C_0$ in each left-hand member.
At least N = 2n + 1 independent data are needed for the determination.

If one or more of the μ’s satisfying (9.4.5) are not real and positive,
the corresponding values of the a’s in (9.4.1) will not be real. In particular,
if $\mu_1$ is real and negative, say $\mu_1 = -\rho_1$, where $\rho_1$ is positive, the
term $\mu_1^x = (-\rho_1)^x$ is real only when x takes on the (integral) values for
which data are prescribed, or values which differ from those values by
integral multiples of the (unit) spacing. However, we may notice that
$(-1)^x = \cos \pi x$ for any such value of x. Hence, if we replace $\mu_1^x$ by
$\rho_1^x \cos \pi x$ or, equivalently, by $e^{a_1 x} \cos \pi x$, where $a_1 = \log \rho_1$, we so obtain a
suitable interpolating function which is real for all real values of x.

More generally, if one value of μ is complex, and hence expressible in
the polar form $\rho e^{i\beta}$, where ρ and β are real and ρ is positive, then the
conjugate $\rho e^{-i\beta}$ must also be involved, since the coefficients in (9.4.5) are
necessarily real. The corresponding part of (9.4.2) can then be written as

$$\rho^x (A_1 e^{i\beta x} + A_2 e^{-i\beta x})$$

where $A_1$ and $A_2$ are constants which must be conjugate complex in
order that the expression be real when x is real. Hence, by writing
$A_1 = (C_1 - iC_2)/2$ and $A_2 = (C_1 + iC_2)/2$, this part of the approximation
can be expressed in the more convenient form

$$\rho^x (C_1 \cos \beta x + C_2 \sin \beta x) = e^{ax} (C_1 \cos \beta x + C_2 \sin \beta x), \tag{9.4.8}$$

where $a = \log \rho$, after the μ’s are determined from (9.4.5) and (9.4.6), but
before equations corresponding to (9.4.4) are formed and solved for the
coefficients of the approximating functions.
In order to illustrate both the technique and the existence of unfavorable
situations, we consider the attempt to recover the equation of the
function

$$f(x) = 2.32 - 1.08 e^{-x} + 1.20 e^{-2x} \tag{9.4.9}$$

on the hypothesis that the numerical coefficients in (9.4.9) are exact. The
values are given, to four decimal places, in the following tabulation:

  x       0        1        2        3        4
  f(x)    2.4400   2.0851   2.1958   2.2692   2.3006

If the ordinates are arbitrarily rounded to two decimal places, the required
differences of the rounded values are found to be −0.35, 0.11, 0.07, and
0.03, and Eqs. (9.4.6), with $f_k$ replaced by $\Delta f_k$, become

$$0.11\alpha_1 - 0.35\alpha_2 = 0.07,$$
$$0.07\alpha_1 + 0.11\alpha_2 = 0.03, \tag{9.4.10}$$

from which there follows $\alpha_1 = \tfrac{91}{183} \doteq 0.497$ and $\alpha_2 = -\tfrac{8}{183} \doteq -0.0437$.
Equation (9.4.5) then becomes

$$183\mu^2 - 91\mu + 8 = 0$$

and yields $\mu_1 \doteq 0.383$ and $\mu_2 \doteq 0.114$, to three places. Thus the required
approximation is to be of the form

$$f(x) \approx C_0 + C_1 (0.383)^x + C_2 (0.114)^x = C_0 + C_1 e^{-0.96x} + C_2 e^{-2.17x}, \tag{9.4.11}$$

after which the C’s may be determined by fitting the data at three points,
or by use of a least-squares procedure over the five points for which data
are provided. More nearly accurate determinations of the decay factors
would have resulted from a reduction of inherent errors in the data
employed, or from the result of using additional data to supply additional
equations, and solving the resultant set approximately by least-squares
methods.

Suppose, however, that the values f(1) ≈ 2.0851 and f(2) ≈ 2.1958
were incorrectly rounded to 2.08 and 2.19, respectively. We notice that
the round-off errors so introduced are only slightly greater than those
effected by the correct rounding, and we may consider these additional
errors as representative of observational errors which could result if the
data were empirical. The four relevant differences are then −0.36,
0.11, 0.08, and 0.03, and the equations replacing (9.4.10) become

$$0.11\alpha_1 - 0.36\alpha_2 = 0.08,$$
$$0.08\alpha_1 + 0.11\alpha_2 = 0.03, \tag{9.4.12}$$

from which there follows $\alpha_1 = \tfrac{196}{409} \doteq 0.479$ and $\alpha_2 = -\tfrac{31}{409} \doteq -0.0758$.
The equation which determines approximations to $\mu_1$ and $\mu_2$ is then
$409\mu^2 - 196\mu + 31 = 0$, which yields the complex roots $\mu_{1,2} \doteq 0.240 \pm 0.136i$.
Since, accordingly, $\mu_{1,2} \doteq e^{-1.29 \pm 0.515i}$, the form replacing (9.4.11)
here becomes

$$f(x) \approx C_0 + e^{-1.29x} (C_1 \cos 0.515x + C_2 \sin 0.515x), \tag{9.4.13}$$

from which the C’s may be determined by collocation or by least squares.

Whereas it is found that the coefficients in (9.4.11) and (9.4.13) can
be determined in such a way that they both provide good approximations
to the true function (9.4.9) for 0 ≤ x ≤ 4 and, indeed, depart only
slightly from f(x) for all x ≥ 0, the latter approximation is oscillatory,
while the true function and the former approximation are not. The
slight additional errors introduced into the given data here lead to
completely incorrect information concerning the decay factors.
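
The two calculations above can be repeated mechanically. The sketch below covers only the special case used in the example, namely n = 2 with an unknown constant term and exactly five ordinates, so that the two difference equations are solved exactly by Cramer's rule; the function name `prony_two_decays` is introduced here for illustration.

```python
import cmath

def prony_two_decays(f):
    """Prony step for f(x) ~ C0 + C1*mu1**x + C2*mu2**x from f(0..4).

    Returns (alpha1, alpha2, mu1, mu2), where mu1, mu2 are the roots of
    mu**2 - alpha1*mu - alpha2 = 0 and may be complex.
    """
    d = [f[i + 1] - f[i] for i in range(4)]      # differences Delta f_k
    # Equations (9.4.6) with f_k replaced by Delta f_k:
    #   d1*alpha1 + d0*alpha2 = d2
    #   d2*alpha1 + d1*alpha2 = d3
    det = d[1] * d[1] - d[0] * d[2]
    alpha1 = (d[2] * d[1] - d[0] * d[3]) / det
    alpha2 = (d[1] * d[3] - d[2] * d[2]) / det
    root = cmath.sqrt(alpha1 * alpha1 + 4 * alpha2)
    return alpha1, alpha2, (alpha1 + root) / 2, (alpha1 - root) / 2

correct = [2.44, 2.09, 2.20, 2.27, 2.30]      # ordinates correctly rounded
perturbed = [2.44, 2.08, 2.19, 2.27, 2.30]    # f(1), f(2) rounded incorrectly

for data in (correct, perturbed):
    a1, a2, m1, m2 = prony_two_decays(data)
    print(round(a1, 3), round(a2, 4), m1, m2, cmath.log(m1))
```

For the first data set the roots are real; for the second the discriminant changes sign and the roots form a conjugate pair, whose logarithm supplies the decay rate and the frequency appearing in (9.4.13).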

While this example was selected deliberately to illustrate a particularly 
unfavorable situation, this type of “instability” is of common occurrence 
when it is necessary to determine the approximating coordinate functions 
themselves, in addition to the constants of combination to be associated 


with them. In such cases, it is particularly desirable that an error
analysis be made.
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Since here the true values of yu; and ws are e~! = 0.368 and e~? = 0.135, 
the true values of a; and a2 are e~! + e~? = 0.503 and —e~* = —0.0498. 
Thus, in the second calculation, errors of magnitude smaller than 0.006 
in the data employed lead to errors of about 0.024 and 0.026 in the cal- 
culation of a; and a2, respectively, and these errors, in turn, lead to com- 
plex approximations of the real yi and yw. The possibility of appreciably 
larger errors than those actually encountered in the calculation of a; and 
a, from either of the sets (9.4.10) and (9.4.12), assuming the coefficients 
to be correct to the places given, could have been predicted by an anal-— 
ysis of those sets.t Once such estimates are obtained, the maximum (or 
RMS) values of the errors 641 and dy, in the roots of u? — am — a2 = 0 
may be estimated, by use of the differential relation 


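$$(2\mu - \alpha_1)\,\delta\mu = \mu\,\delta\alpha_1 + \delta\alpha_2,$$

as

$$|\delta\mu_1|_{\max} \approx \frac{1 + |\mu_1|}{|\mu_2 - \mu_1|}\,|\delta\alpha|_{\max}, \qquad |\delta\mu_2|_{\max} \approx \frac{1 + |\mu_2|}{|\mu_2 - \mu_1|}\,|\delta\alpha|_{\max} \tag{9.4.14}$$

or

$$(\delta\mu_1)_{\mathrm{RMS}} \approx \frac{\sqrt{1 + \mu_1^2}}{|\mu_2 - \mu_1|}\,(\delta\alpha)_{\mathrm{RMS}}, \qquad (\delta\mu_2)_{\mathrm{RMS}} \approx \frac{\sqrt{1 + \mu_2^2}}{|\mu_2 - \mu_1|}\,(\delta\alpha)_{\mathrm{RMS}}, \tag{9.4.15}$$

with $\mu_1$ and $\mu_2$ replaced by their calculated values, if those calculated
values are real, and if the errors are small. The reality of $\mu_1$ and $\mu_2$
depends upon the positivity of $\alpha_1^2 + 4\alpha_2$ and is in doubt if
$|\alpha_1^2 + 4\alpha_2| < 2(2 + |\alpha_1|)\,|\delta\alpha|_{\max}$, when $\alpha_1$ and $\alpha_2$ are estimated by their calculated values.
Similar considerations apply to the more involved cases in which more
coordinate functions are employed.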
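The estimates (9.4.14) and the test for doubtful reality are simple enough to evaluate directly. The following is a minimal sketch only; the function names are hypothetical, and the perturbation size 0.026 is merely the larger of the two coefficient errors quoted above, used here as an assumed value of $|\delta\alpha|_{\max}$.

```python
def root_error_bounds(mu1, mu2, dalpha_max):
    """Maximum-error estimates for the two roots, in the manner of (9.4.14)."""
    gap = abs(mu2 - mu1)
    return ((1.0 + abs(mu1)) / gap * dalpha_max,
            (1.0 + abs(mu2)) / gap * dalpha_max)

def reality_in_doubt(alpha1, alpha2, dalpha_max):
    """True when the sign of alpha1**2 + 4*alpha2 is not settled by the data."""
    return abs(alpha1 ** 2 + 4.0 * alpha2) < 2.0 * (2.0 + abs(alpha1)) * dalpha_max

# Calculated coefficients from the second calculation, with an assumed error level.
in_doubt = reality_in_doubt(0.479, -0.0758, 0.026)

# Bounds evaluated at the true roots e**-1 and e**-2 (rounded) for comparison.
bounds = root_error_bounds(0.368, 0.135, 0.026)
print(in_doubt, bounds)
```

Because the two roots differ by only about 0.23, the factor $1/|\mu_2 - \mu_1|$ magnifies the coefficient errors roughly fourfold or more.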

9.5. Determination of Constituent Periodicities. It frequently happens
that an empirical function $f(x)$ is known to be expressible as a linear
combination of two or more periodic terms whose periods are unknown
and are not necessarily commensurable, and the approximate determination
of these periods from empirical data is often of considerable
importance.

If $m$ distinct periods, denoted by $2\pi/\omega_1, \ldots, 2\pi/\omega_m$, are known (or
assumed) to be present, then $f(x)$ correspondingly can be assumed to be
approximated by an expression of the form

$$f(x) \approx A_1 \cos \omega_1 x + B_1 \sin \omega_1 x + \cdots + A_m \cos \omega_m x + B_m \sin \omega_m x. \tag{9.5.1}$$


† The analysis of RMS errors relevant to the normal equations obtained in a
least-squares procedure is described in §7.3 [see (7.3.36)]. For the corresponding analysis
of maximum errors when least-squares methods are not used, as in the present case,
see §10.6.

But such an approximation is a special case of (9.4.1), in which $n = 2m$
and in which the $a$'s are identified respectively with $i\omega_1$, $-i\omega_1$, $\ldots$,
$i\omega_m$, and $-i\omega_m$. Thus the desired values of $\omega$ may be obtained by
Prony's method, if we set $n = 2m$ and $\mu = e^{i\omega}$ in (9.4.5) and (9.4.6),
again assuming $f(x)$ to be given for $x = 0, 1, 2, \ldots, N - 1$.†

Since, in this case, the roots of (9.4.5) are known (or required) to occur
in reciprocal pairs $(e^{i\omega_k}, e^{-i\omega_k})$, it follows that (9.4.5) must be invariant
under the substitution of $1/\mu$ for $\mu$, so that we must have $\alpha_{2m} = -1$,
$\alpha_{2m-1} = \alpha_1$, $\ldots$, $\alpha_{m+1} = \alpha_{m-1}$. Thus, with $\mu = e^{i\omega}$, Eq. (9.4.5) becomes

$$e^{i2m\omega} - \alpha_1 e^{i(2m-1)\omega} - \cdots - \alpha_{m-1} e^{i(m+1)\omega} - \alpha_m e^{im\omega} - \alpha_{m-1} e^{i(m-1)\omega} - \cdots - \alpha_1 e^{i\omega} + 1 = 0$$

or

$$e^{im\omega}\left[(e^{im\omega} + e^{-im\omega}) - \alpha_1 (e^{i(m-1)\omega} + e^{-i(m-1)\omega}) - \cdots - \alpha_{m-1}(e^{i\omega} + e^{-i\omega}) - \alpha_m\right] = 0$$

and hence, finally, since $e^{im\omega} \ne 0$, we find that the equation determining
$\omega$ is of the form

$$2\cos m\omega - 2\alpha_1 \cos (m-1)\omega - \cdots - 2\alpha_{m-1}\cos\omega - \alpha_m = 0. \tag{9.5.2}$$


Since $\cos k\omega$ is expressible as a polynomial of degree $k$ in $\cos\omega$, this
equation can be expressed as an algebraic equation of degree $m$ in $\cos\omega$.
Indeed, it can be expressed in the form

$$T_m(\cos\omega) - \alpha_1 T_{m-1}(\cos\omega) - \cdots - \alpha_{m-1}T_1(\cos\omega) - \tfrac{1}{2}\alpha_m = 0,$$

in terms of the Chebyshev polynomials of §7.8.
The $N - n$ equations (9.4.6), which serve to determine the coefficients
$\alpha_1, \ldots, \alpha_m$ in (9.5.2), reduce (again with $n = 2m$, $\alpha_{m+k} = \alpha_{m-k}$, and
$\alpha_{2m} = -1$) to the forms

$$\begin{aligned}
(f_1 + f_{2m-1})\alpha_1 + (f_2 + f_{2m-2})\alpha_2 + \cdots + (f_{m-1} + f_{m+1})\alpha_{m-1} + f_m\alpha_m &= f_0 + f_{2m},\\
(f_2 + f_{2m})\alpha_1 + (f_3 + f_{2m-1})\alpha_2 + \cdots + (f_m + f_{m+2})\alpha_{m-1} + f_{m+1}\alpha_m &= f_1 + f_{2m+1},\\
\cdots\qquad\qquad &\\
(f_{N-2m} + f_{N-2})\alpha_1 + (f_{N-2m+1} + f_{N-3})\alpha_2 + \cdots + (f_{N-m-2} + f_{N-m})\alpha_{m-1} + f_{N-m-1}\alpha_m &= f_{N-2m-1} + f_{N-1}.
\end{aligned} \tag{9.5.3}$$

In accordance with the fact that the approximation (9.5.1) involves $3m$
unknown constants, we must have $N \ge 3m$. The set (9.5.3) then comprises
$N - 2m \ge m$ equations in the $m$ unknowns $\alpha_k$.


† In the more general case, the dimensionless variable $x$ accordingly represents
displacement from a reference point in units of the actual spacing $h$, as in the preceding
section, and the calculated periods are also to be considered as expressed in units of $h$.

This set is to be solved (approximately, by least squares, if $N > 3m$)
for the $\alpha$'s, and the $\omega$'s are then determined from (9.5.2), after which the
coefficients in (9.5.1) are determined (if their values are desired) by writing
down the conditions which would require (9.5.1) to be an equality for
at least $2m$ of the $N$ relevant values of $x$, and solving that set
approximately, by the method of least squares, if more than $2m$ conditions are
used.

If, in addition, an unknown constant $A_0$ is present in the right-hand
member of (9.5.1), Eqs. (9.5.3) are to be modified by replacing each $f_k$
by $\Delta f_k$, and the constant $A_0$ will then appear only in the set of equations
determining the coefficients in (9.5.1). Here we must have $N \ge 3m + 1$
given data.

As a simple illustration, we attempt to determine the constituent
periods of the function

$$f(x) = \cos\frac{\pi x}{3} + \sin\frac{\pi x}{4}, \tag{9.5.4}$$

assuming knowledge only of the following rounded values of that function:

    x     |  0      1      2      3      4      5      6      7      8      9     10
    f(x)  |  1.00   1.21   0.50  -0.29  -0.50  -0.21   0.00  -0.21  -0.50  -0.29   0.50

If we suppose that the vanishing of the over-all mean value of $f(x)$ is
not known in advance, but that there is evidence (from physical
considerations or otherwise) that the deviation from the mean is due to the
superposition of two periodic processes, we first calculate the relevant
differences:

    x      |  0      1      2      3      4      5      6      7      8      9
    Δf(x)  |  0.21  -0.71  -0.79  -0.21   0.29   0.21  -0.21  -0.29   0.21   0.79


Next, the six equations corresponding to (9.5.3) are written down,

$$\begin{aligned}
-0.92\alpha_1 - 0.79\alpha_2 &= 0.50,\\
-0.50\alpha_1 - 0.21\alpha_2 &= -0.50,\\
0.29\alpha_2 &= -1.00,\\
0.08\alpha_1 + 0.21\alpha_2 &= -0.50,\\
-0.08\alpha_1 - 0.21\alpha_2 &= 0.50,\\
-0.29\alpha_2 &= 1.00,
\end{aligned} \tag{9.5.5}$$

and the relevant normal equations are obtained in the form

$$\begin{aligned}
1.1092\alpha_1 + 0.8654\alpha_2 &= -0.2900,\\
0.8654\alpha_1 + 0.9246\alpha_2 &= -1.0800.
\end{aligned} \tag{9.5.6}$$

The solution is found to be $\alpha_1 \doteq 2.4092$, $\alpha_2 \doteq -3.4230$, to four places,
after which (9.5.2) becomes

$$2\cos 2\omega - 4.8184\cos\omega + 3.4230 = 0$$

or

$$4\cos^2\omega - 4.8184\cos\omega + 1.4230 = 0, \tag{9.5.7}$$

and yields the values

$$\cos\omega_1 \doteq 0.5186, \qquad \cos\omega_2 \doteq 0.6860,$$

from which the appropriate approximations to the true periods

$$P_1 = \frac{2\pi}{\omega_1} = 6, \qquad P_2 = \frac{2\pi}{\omega_2} = 8$$

are found to be

$$P_1 \doteq \frac{2\pi}{1.0256} \doteq 6.12, \qquad P_2 \doteq \frac{2\pi}{0.8147} \doteq 7.71. \tag{9.5.8}$$

Hence the round-off errors introduced into the given data here correspond
to errors of about 2 and 4 per cent in the calculations of $P_1$ and $P_2$,
respectively. The corresponding approximation to the governing equation
would then be obtained, if it were desired, by fitting the equation

$$f(x) \approx A_0 + A_1\cos 1.026x + B_1\sin 1.026x + A_2\cos 0.815x + B_2\sin 0.815x \tag{9.5.9}$$

to the data by use of the least-squares procedure.
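The steps of this illustration can be retraced numerically. What follows is only a sketch for the case $m = 2$ with an unknown constant term, assuming the eleven rounded ordinates tabulated above; the variable names are illustrative, and the normal equations are solved by Cramer's rule simply to avoid any dependence on a linear-algebra library.

```python
import math

# Rounded ordinates f(0), ..., f(10) and their first differences.
f = [1.00, 1.21, 0.50, -0.29, -0.50, -0.21, 0.00, -0.21, -0.50, -0.29, 0.50]
d = [f[k + 1] - f[k] for k in range(len(f) - 1)]

# For m = 2 each equation reads
#   (d[k+1] + d[k+3])*alpha1 + d[k+2]*alpha2 = d[k] + d[k+4].
rows = [(d[k + 1] + d[k + 3], d[k + 2], d[k] + d[k + 4])
        for k in range(len(d) - 4)]

# Normal equations of the least-squares problem, solved by Cramer's rule.
s11 = sum(p * p for p, q, r in rows)
s12 = sum(p * q for p, q, r in rows)
s22 = sum(q * q for p, q, r in rows)
t1 = sum(p * r for p, q, r in rows)
t2 = sum(q * r for p, q, r in rows)
det = s11 * s22 - s12 * s12
alpha1 = (t1 * s22 - s12 * t2) / det
alpha2 = (s11 * t2 - s12 * t1) / det

# 2cos(2w) - 2*alpha1*cos(w) - alpha2 = 0 becomes, with c = cos(w),
#   4c**2 - 2*alpha1*c - (2 + alpha2) = 0.
root = math.sqrt(alpha1 ** 2 + 4.0 * (2.0 + alpha2))
cosines = sorted([(alpha1 - root) / 4.0, (alpha1 + root) / 4.0])

# Periods in units of the spacing, taking each omega between 0 and pi.
periods = [2.0 * math.pi / math.acos(c) for c in cosines]
print(round(alpha1, 4), round(alpha2, 4))
print([round(c, 4) for c in cosines], [round(p, 2) for p in periods])
```

With exact ordinates the same script would return the periods 6 and 8; the discrepancy is due entirely to the two-place rounding of the data.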

Here, and in the general case, it may be noticed that only the value of
$\cos\omega_k$ is determinate. Thus, if we denote by $\bar\omega_k$ the admissible value of
$\omega_k$ which lies between 0 and $\pi$, we can conclude only that the proper
approximate value of $\omega_k$ is one of the numbers $\pm\bar\omega_k + 2r\pi$ $(r = 0, 1, 2,
\ldots)$, so that, if the true physical spacing is $h$, the actual approximate
period is known only to be one of the numbers

$$\frac{2\pi h}{\bar\omega_k + 2r\pi}, \qquad \frac{2\pi h}{2\pi - \bar\omega_k + 2r\pi} \qquad (r = 0, 1, \ldots).$$

Of these possibilities, only those corresponding to $r = 0$ can exceed the
spacing $h$; the first $[2\pi h/\bar\omega_k]$ exceeds $2h$, whereas the second $[2\pi h/(2\pi - \bar\omega_k)]$
lies between $h$ and $2h$. The data employed clearly cannot be expected
to determine periods smaller than the spacing with any appreciable
accuracy, in general. Whether or not either of the two remaining
appropriate alternatives truly represents an approximate period could be
determined mathematically by investigating whether a second calculation
based on a set of additional data, with a spacing incommensurable with
$h$, also yields that alternative. In practice, the decision frequently can
be based more simply on an inspection of the graph of the data or on
physical considerations.

Situations in which two or more of the constituent periods are nearly
equal are of frequent practical occurrence and are the most troublesome
ones. In such cases it is particularly important to retain sufficiently
many terms in the approximation and to use a sufficiently large set of
data when the data are inexact. An interesting example of this type is
treated successfully in Whittaker and Robinson [20] (§175), by a method
which differs from the present one, and also in Willers [21], by a
method equivalent to that given here. In that case, 600 data
are available for the determination of two constituent periods, and
are employed in the former treatment. Whereas the latter treatment
does not use all the 595 equations which could be formed, corresponding to
(9.5.5), it first makes use of a selected set of 78 equations whose formation
involves the use of most of the data, and then checks the results by
recalculation using a judiciously chosen similar set of 17 equations.

9.6. Optimum Polynomial Interpolation with Selected Abscissas. It
has been shown in earlier chapters that, if a function $f(x)$ is approximated
by the polynomial $y(x)$ of degree $n$ which agrees with $f(x)$ at $n + 1$ points
$x_0, x_1, \ldots, x_n$, we may write

$$f(x) = y(x) + \pi(x)\,\frac{f^{(n+1)}(\xi)}{(n+1)!}, \tag{9.6.1}$$

where

$$\pi(x) = (x - x_0)(x - x_1)\cdots(x - x_n), \tag{9.6.2}$$

and where $\xi$ lies in the interval $I$ limited by the largest and smallest of
$x_0, x_1, \ldots, x_n$, and $x$. We suppose here that a suitable change of
variables has reduced this interval to the interval $(-1,1)$.

In the preceding chapter it was seen that certain
choices of the $n + 1$ abscissas lead to quadrature formulas having certain
desirable characteristics. In this section we investigate a related
class of interpolation formulas and single out a particular one, which
is related to trigonometric approximation, for treatment in the following section.

Whereas the parameter $\xi$ in (9.6.1) depends upon the $n + 1$ abscissas
and the variable $x$, the nature of that dependence will depend also
upon the nature of the function $f(x)$. Thus, if we desire to choose the
abscissas in such a way that the error will tend to be as small as possible
over $(-1,1)$, in some sense, for the set of all functions having $n + 1$
continuous derivatives in $(-1,1)$, we may attempt to make $\pi(x)$ as
small as possible, in the same sense, over that interval, recalling that the
coefficient of the highest power of $x$ in $\pi(x)$ must be unity, from (9.6.2).
In particular, we may require that

$$\int_{-1}^{1} w(x)[\pi(x)]^2\,dx = \min, \tag{9.6.3}$$

where $w(x)$ is a prescribed weighting function which is nonnegative in
$(-1,1)$. If we notice that $\pi(x)$ is expressible in the form

$$\pi(x) = x^{n+1} + c_n x^n + \cdots + c_2 x^2 + c_1 x + c_0, \tag{9.6.4}$$

and hence may be considered to be specified by the $n + 1$ coefficients
$c_0, c_1, \ldots, c_n$, we deduce that (9.6.3) leads to the requirement that the
partial derivative of the left-hand member with respect to each $c_r$ must
vanish. Since also

$$\frac{\partial \pi(x)}{\partial c_r} = x^r \qquad (r = 0, 1, \ldots, n), \tag{9.6.5}$$

this requirement becomes

$$2\int_{-1}^{1} w(x)\,\pi(x)\,\frac{\partial \pi(x)}{\partial c_r}\,dx = 2\int_{-1}^{1} w(x)\,x^r\,\pi(x)\,dx = 0 \qquad (r = 0, 1, \ldots, n), \tag{9.6.6}$$

so that $\pi(x)$ is to be that polynomial of degree $n + 1$, with leading coefficient
unity, which is orthogonal to all polynomials of inferior degree over $(-1,1)$,
relative to $w(x)$. The abscissas of the $n + 1$ points, at which the
agreement between $f(x)$ and the polynomial approximation should be effected,
are thus the zeros of that polynomial.


It is of interest to notice that, with the interpolation polynomial so
determined, the integral approximation

$$\int_{-1}^{1} w(x)f(x)\,dx \approx \int_{-1}^{1} w(x)y(x)\,dx$$

is the corresponding Gaussian quadrature formula of Chap. 8.

Thus, in particular, if we take $w(x) = 1$, and so attempt to minimize
the integral of the square of the error $E(x)$ over the interval $(-1,1)$, it
follows from the results of §7.5 that the $n + 1$ abscissas should be the
zeros of $P_{n+1}(x)$. Certain such sets of abscissas are listed in Table 8.1
(§8.5).

Further, if we take $w(x) = 1/\sqrt{1 - x^2}$, and so attempt to minimize
the integral of $[E(x)]^2/\sqrt{1 - x^2}$, the results of §7.8 show that the abscissas
are to be the zeros of the $(n + 1)$th Chebyshev polynomial $T_{n+1}(x)$,

$$T_{n+1}(x) = \cos\left[(n + 1)\cos^{-1} x\right], \tag{9.6.7}$$

and hence are given by

$$x_i = \cos\left(\frac{2i + 1}{n + 1}\,\frac{\pi}{2}\right) \qquad (i = 0, 1, \ldots, n). \tag{9.6.8}$$

Since the coefficient of $x^r$ in $T_r(x)$ is $2^{r-1}$, it follows also that then

$$\pi(x) = 2^{-n}\,T_{n+1}(x). \tag{9.6.9}$$
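The effect of the two choices of abscissas on the size of $\pi(x)$ can be seen in the smallest nontrivial case. The sketch below takes $n = 2$ (three abscissas), uses the zeros $0, \pm\sqrt{3/5}$ of $P_3(x)$ as the Legendre abscissas, and estimates the maximum of $|\pi(x)|$ on a uniform grid; the function names and the grid size are illustrative assumptions.

```python
import math

def chebyshev_abscissas(n):
    """Zeros of T_{n+1}(x): x_i = cos((2i+1)*pi / (2(n+1))), i = 0..n."""
    return [math.cos((2 * i + 1) * math.pi / (2 * (n + 1))) for i in range(n + 1)]

def pi_poly(x, nodes):
    """pi(x) = (x - x_0)(x - x_1)...(x - x_n), leading coefficient unity."""
    p = 1.0
    for xi in nodes:
        p *= x - xi
    return p

n = 2
cheb_nodes = chebyshev_abscissas(n)
leg_nodes = [-math.sqrt(0.6), 0.0, math.sqrt(0.6)]   # zeros of P_3(x)

# Uniform grid on [-1, 1] that contains the end points and x = +/-0.5.
grid = [-1.0 + 2.0 * j / 2000 for j in range(2001)]
max_cheb = max(abs(pi_poly(x, cheb_nodes)) for x in grid)
max_leg = max(abs(pi_poly(x, leg_nodes)) for x in grid)
print(max_cheb, max_leg)
```

For $n = 2$ the Chebyshev abscissas give a maximum of $2^{-2}$, attained at $x = \pm 1$ and $x = \pm\tfrac12$, while the Legendre abscissas give the larger value attained at the ends of the interval.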


In addition, we may notice that the extreme values of $\pi(x)$ in $(-1,1)$ are
then $\pm 2^{-n}$ and are taken on (with successively alternating signs) at the
end points $x = \pm 1$ and at $n$ additional interior points, each of which
separates a pair of adjacent abscissas. Thus, with this choice of
abscissas, the coefficient of $f^{(n+1)}(\xi)/(n + 1)!$ in the error term of (9.6.1)
oscillates with constant amplitude $2^{-n}$ as $x$ varies over $(-1,1)$.

On the other hand, since the coefficient of $x^r$ in $P_r(x)$ is $(2r)!/[2^r (r!)^2]$,
the choice of the zeros of $P_{n+1}(x)$ as the abscissas corresponds to the
identification

$$\pi(x) = \frac{2^{n+1}[(n + 1)!]^2}{(2n + 2)!}\,P_{n+1}(x). \tag{9.6.10}$$

The Legendre polynomial takes on the value $\pm 1$ at $x = \pm 1$,
and $P_{n+1}(x)$ performs oscillations in $(-1,1)$
such that the $n$ successive maxima and minima separating pairs of
adjacent zeros inside the interval decrease in magnitude toward the center of the
interval. Thus, in particular, the maximum absolute
value of $\pi(x)$ in (9.6.10), over $(-1,1)$, is given by the numerical coefficient in that
equation, which is approximated by $2^{-(n+1)}(\pi n)^{1/2}$ when $n$ is large, where $\pi$
here is the constant $3.14159\ldots$.

Thus it follows that, whereas use of the zeros of $P_{n+1}(x)$ minimizes the
RMS value of $\pi(x)$ over $(-1,1)$, the use of the zeros of $T_{n+1}(x)$ leads to a
value of $|\pi(x)|_{\max}$ which is smaller than that corresponding to the
former choice, by a factor which tends to increase in proportion to $n^{1/2}$ as
$n$ increases. Furthermore, the error will tend to oscillate uniformly over
$(-1,1)$ in the second case, whereas it will tend to oscillate with an amplitude
increasing toward the ends of the interval in the first case, on
the average. Thus, if it is desirable to control the maximum error
rather than the RMS error, the second choice will generally be preferable to
the first.

Indeed, it was discovered by Chebyshev that the second choice is the best possible
one when the maximum-error criterion is applied to $\pi(x)$. This fact is established
most easily by assuming, on the contrary, that there exists a
polynomial $\bar\pi(x)$ of degree $n + 1$ (with leading coefficient unity) whose
maximum absolute value in $(-1,1)$ is smaller than $2^{-n}$. Then the
difference $\bar\pi(x) - 2^{-n}T_{n+1}(x)$ is negative at the maxima of $T_{n+1}(x)$ and positive
at its minima. Hence, since $2^{-n}T_{n+1}(x)$ takes on its extreme values
$(\pm 2^{-n})$ at $n + 2$ points of $(-1,1)$, including the ends, the difference
$\bar\pi(x) - 2^{-n}T_{n+1}(x)$ must vanish at least $n + 1$ times. But, since this
difference is a polynomial of degree $n$ or less (the common leading terms
are removed by the subtraction), this situation is impossible, and
the desired contradiction is obtained.†


† It should be noticed that this result applies only to the minimization of the
maximum value of $|\pi(x)|$ over $(-1,1)$. For any specific function $f(x)$, the maximum
value of the error $\pi(x)f^{(n+1)}(\xi)/(n + 1)!$ is not necessarily minimized by this choice
of abscissas, since the maximum value of $|f^{(n+1)}(\xi)|$ generally will not be
associated with an abscissa for which $|\pi(x)|$ is greatest.









9.7. Chebyshev Interpolation. In this section, we consider in more
detail the polynomial interpolation formula based on collocation at the
zeros of $T_{n+1}(x)$. Since any polynomial of degree $n$ can be expressed as a
linear combination of Chebyshev polynomials of degrees zero through
$n$, it is convenient to express the polynomial $y(x)$ which agrees with $f(x)$
when $x = x_0, x_1, \ldots, x_n$, where $x_i$ is the $i$th zero of $T_{n+1}(x)$, in such a
form, and so to write

$$f(x) = \sum_{k=0}^{n} C_k T_k(x) + \frac{1}{2^n (n+1)!}\,T_{n+1}(x)\,f^{(n+1)}(\xi), \tag{9.7.1}$$

in accordance with (9.6.1), where $|\xi| < 1$ under the assumption that $x$ is
in $(-1,1)$. The $C$'s are to be determined in such a way that the result of
suppressing the error term is correct when $x = x_0, x_1, \ldots, x_n$, where

$$x_i = \cos\left(\frac{2i + 1}{n + 1}\,\frac{\pi}{2}\right) \qquad (i = 0, 1, \ldots, n). \tag{9.7.2}$$

Whereas the desired interpolation polynomial could be expressed in the
Lagrangian form of Chap. 3, the following alternative procedure is usually
more convenient for its determination.

If we introduce the change of variables

$$x = \cos\theta \qquad (0 \le \theta \le \pi), \tag{9.7.3}$$

the requirement

$$f(x) \approx \sum_{k=0}^{n} C_k T_k(x) \qquad (-1 \le x \le 1) \tag{9.7.4}$$

becomes

$$F(\theta) \approx \sum_{k=0}^{n} C_k \cos k\theta \qquad (0 \le \theta \le \pi), \tag{9.7.5}$$

with the abbreviation

$$F(\theta) = f(\cos\theta). \tag{9.7.6}$$

The $C$'s are now to be determined in such a way that (9.7.5) is an equality
when $\theta = \theta_i$, where

$$\theta_i = \cos^{-1} x_i = \frac{2i + 1}{n + 1}\,\frac{\pi}{2} \qquad (i = 0, 1, \ldots, n). \tag{9.7.7}$$

Thus the agreement is to occur at the equally spaced points $\pi/(2n + 2)$,
$3\pi/(2n + 2)$, $\ldots$, $(2n + 1)\pi/(2n + 2)$, which are seen to be midway
between the successive points $0$, $\pi/(n + 1)$, $2\pi/(n + 1)$, $\ldots$, $\pi$ which
would have been employed in the procedure of §9.3, as the points of
collocation for the determination of the $C$'s in the form (9.7.5).

In analogy to corresponding results of that section, it happens that
$\cos j\theta$ and $\cos k\theta$ are orthogonal under summation over the $n + 1$ points
defined by (9.7.7) (see Prob. 22),

$$\sum_{i=0}^{n} \cos j\theta_i \cos k\theta_i = \begin{cases} 0 & (j \ne k),\\[2pt] \dfrac{n + 1}{2} & (j = k \ne 0),\\[6pt] n + 1 & (j = k = 0), \end{cases} \tag{9.7.8}$$

where $j$ and $k$ are nonnegative integers not exceeding $n$. Moreover, since
the left-hand member of (9.7.8) is identical with

$$\sum_{i=0}^{n} T_j(x_i)\,T_k(x_i),$$

it follows that, whereas $T_0(x), T_1(x), \ldots$ are orthogonal under integration
over $(-1,1)$ relative to $w(x) = 1/\sqrt{1 - x^2}$, the functions $T_0(x)$,
$T_1(x), \ldots, T_n(x)$ are orthogonal under summation over the zeros of
$T_{n+1}(x)$, with a unit weighting function.

The truth of (9.7.8) permits us to deduce immediately that the required
$C$'s are expressible in the form

$$C_0 = \frac{1}{n + 1}\sum_{i=0}^{n} F(\theta_i), \qquad C_k = \frac{2}{n + 1}\sum_{i=0}^{n} F(\theta_i)\cos k\theta_i \quad (k \ne 0), \tag{9.7.9}$$

where $\theta_i$ is defined by (9.7.7), or, alternatively, in the form

$$C_0 = \frac{1}{n + 1}\sum_{i=0}^{n} f(x_i), \qquad C_k = \frac{2}{n + 1}\sum_{i=0}^{n} f(x_i)\,T_k(x_i) \quad (k \ne 0), \tag{9.7.10}$$

where $x_i$ is defined by (9.7.2).
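The formulas for the $C$'s translate directly into a few lines of code. The following is a minimal sketch, assuming $f(x) = e^x$ and $n = 5$ purely as a test case; the function names are hypothetical, and the interpolant is evaluated through $T_k(x) = \cos(k\cos^{-1}x)$ rather than by any recurrence.

```python
import math

def chebyshev_coefficients(f, n):
    """C_0..C_n from summation over the n+1 equally spaced angles theta_i."""
    thetas = [(2 * i + 1) * math.pi / (2 * (n + 1)) for i in range(n + 1)]
    F = [f(math.cos(t)) for t in thetas]
    C = [sum(F) / (n + 1)]
    for k in range(1, n + 1):
        C.append(2.0 / (n + 1) * sum(Fi * math.cos(k * t) for Fi, t in zip(F, thetas)))
    return C

def evaluate(C, x):
    """Sum of C_k * T_k(x) for -1 <= x <= 1, using T_k(x) = cos(k * acos(x))."""
    theta = math.acos(x)
    return sum(Ck * math.cos(k * theta) for k, Ck in enumerate(C))

n = 5
C = chebyshev_coefficients(math.exp, n)
nodes = [math.cos((2 * i + 1) * math.pi / (2 * (n + 1))) for i in range(n + 1)]

# The interpolant reproduces f at the abscissas and stays close to it elsewhere.
node_err = max(abs(evaluate(C, x) - math.exp(x)) for x in nodes)
max_err = max(abs(evaluate(C, -1.0 + j / 500) - math.exp(-1.0 + j / 500))
              for j in range(1001))
print(C)
print(node_err, max_err)
```

For this test function the error term gives the bound $e/(2^5 \cdot 6!) \approx 1.2 \times 10^{-4}$ on the interval, which the sampled maximum error respects.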
Thus, for example, we may construct Table 9.2 when $n = 5$. Here use
has been made of the abbreviations

$$A = \cos\frac{\pi}{12} = \tfrac{1}{2}\sqrt{2 + \sqrt{3}} \doteq 0.96593, \qquad B = \cos\frac{5\pi}{12} = \tfrac{1}{2}\sqrt{2 - \sqrt{3}} \doteq 0.25882. \tag{9.7.11}$$

The dual headings permit the table to be used either with the function
expressed as $f(x)$ over $-1 \le x \le 1$, with the unequally spaced abscissas
listed in the third column, or with the function expressed as $F(\theta)$ over
$0 \le \theta \le \pi$, with the equally spaced abscissas listed in the first column.

TABLE 9.2

    θ        f(x)        x = T1(x)   T2(x)     T3(x)     T4(x)    T5(x)
             F(θ)        cos θ       cos 2θ    cos 3θ    cos 4θ   cos 5θ

    π/12     F1 = f1       A         ½√3       ½√2        ½         B
    3π/12    F2 = f2      ½√2         0       -½√2       -1       -½√2
    5π/12    F3 = f3       B        -½√3      -½√2        ½         A
    7π/12    F4 = f4      -B        -½√3       ½√2        ½        -A
    9π/12    F5 = f5     -½√2         0        ½√2       -1        ½√2
    11π/12   F6 = f6      -A         ½√3      -½√2        ½        -B

    Sum      6C0          3C1        3C2       3C3       3C4       3C5











Thus, for example, the coefficient of $\cos 4\theta$ in (9.7.5) is given by

$$C_4 = \tfrac{1}{3}\left(\tfrac{1}{2}F_1 - F_2 + \tfrac{1}{2}F_3 + \tfrac{1}{2}F_4 - F_5 + \tfrac{1}{2}F_6\right),$$

whereas the coefficient of $T_4(x)$ in (9.7.4) is given by

$$C_4 = \tfrac{1}{3}\left(\tfrac{1}{2}f_1 - f_2 + \tfrac{1}{2}f_3 + \tfrac{1}{2}f_4 - f_5 + \tfrac{1}{2}f_6\right).$$

These formulas apply whether or not all the harmonics involved are to be
used. The result of retaining a smaller number of harmonics is a
least-squares approximation over the six points.

When the $C$'s are determined, the evaluation of the right-hand member
of

$$f(x) \approx \sum_{k=0}^{n} C_k T_k(x) \tag{9.7.12}$$

at intermediate points is facilitated by the use of available tables of
the Chebyshev polynomials [see reference [273], in which $2T_k(x/2)$
is tabulated instead and is denoted by $C_k(x)$].

9.8. Economization of Power Series. It was seen in §7.8 that the
least-squares polynomial approximation to a function $f(x)$,
over $(-1,1)$, where the integral of the product of $1/\sqrt{1 - x^2}$ and the
square of the error is to be minimized, is of the form

$$f(x) \approx y(x) = \sum_{k=0}^{n} a_k T_k(x) \qquad (|x| < 1), \tag{9.8.1}$$

where

$$a_0 = \frac{1}{\pi}\int_{-1}^{1} \frac{f(x)}{\sqrt{1 - x^2}}\,dx, \qquad a_k = \frac{2}{\pi}\int_{-1}^{1} \frac{f(x)\,T_k(x)}{\sqrt{1 - x^2}}\,dx \quad (k \ge 1). \tag{9.8.2}$$

The approximation so determined generally will not be identified with
that of (9.7.12), since the coefficients, determined in the one case by
summation over a discrete set and in the other by integration over an interval,
are generally unequal. However, the two approximations may be
expected to be of similar nature, in the sense that the error associated
with each will tend to oscillate with uniform amplitude over $(-1,1)$,
whereas that afforded by the finite Legendre series arising from
least-squares approximation with uniform weighting (§7.5) will tend to
oscillate with an amplitude which increases toward the ends of that interval,
on the average. Accordingly, if the smallness of the maximum error is
to be the governing criterion, it may be expected that a satisfactory
approximation may be afforded by fewer terms of the Chebyshev series
than would be required for the Legendre series.

For many functions, the evaluation of the integrals occurring in (9.8.2)
is not readily effected. However, if a function $f(x)$ is defined by a power
series in $x$, for sufficiently small values of $|x|$, or if such an expansion is
readily obtainable, an alternative method suggested by Lanczos (see
references [129] and [273]) is often useful.

Suppose that one has the relation

$$f(x) = \sum_{k=0}^{n} A_k x^k + E_n(x), \tag{9.8.3}$$

where it is known that

$$|E_n(x)| < \epsilon_1 \qquad (-1 \le x \le 1), \tag{9.8.4}$$

and that $\epsilon_1$ is smaller than the prescribed error tolerance $\epsilon$, whereas
$|A_n| + \epsilon_1$ is not a tolerable error, so that the last term in the approximation

$$f(x) \approx \sum_{k=0}^{n} A_k x^k \tag{9.8.5}$$

cannot be safely neglected.

Now let the right-hand member of (9.8.5) be expanded in a series of
Chebyshev polynomials. Since that member is a polynomial of degree
$n$, the resultant series will terminate with the term involving $T_n(x)$ and
hence will be of the form

$$\sum_{k=0}^{n} A_k x^k = \sum_{k=0}^{n} a_k T_k(x). \tag{9.8.6}$$

From the fact that the terms of highest degree in $T_n(x)$ are given by

$$T_n(x) = 2^{n-1}\left(x^n - \frac{n}{4}\,x^{n-2} + \cdots\right), \tag{9.8.7}$$

it follows that the result of expressing the two members of (9.8.6) in terms
of decreasing powers of $x$ will be of the form

$$A_n x^n + A_{n-1}x^{n-1} + A_{n-2}x^{n-2} + \cdots = 2^{n-1}a_n\left(x^n - \frac{n}{4}\,x^{n-2} + \cdots\right) + 2^{n-2}a_{n-1}\left(x^{n-1} - \frac{n-1}{4}\,x^{n-3} + \cdots\right) + 2^{n-3}a_{n-2}\left(x^{n-2} - \cdots\right) + \cdots, \tag{9.8.8}$$

so that there must follow

$$a_n = 2^{-(n-1)}A_n, \qquad a_{n-1} = 2^{-(n-2)}A_{n-1}, \qquad a_{n-2} = 2^{-(n-3)}\left(A_{n-2} + \frac{n}{4}\,A_n\right), \quad \cdots. \tag{9.8.9}$$

Thus, if $n$ is sufficiently large, the coefficients of $T_n(x)$, $T_{n-1}(x)$,
$\ldots$ in (9.8.6) will be small relative to the respective coefficients of
$x^n$, $x^{n-1}$, $\ldots$ in (9.8.5), and it may happen that, for some $m$,
the sum $|a_{n-m+1}| + \cdots + |a_{n-1}| + |a_n| + \epsilon_1$ is smaller than $\epsilon$ and hence is
tolerable in the desired approximation to $f(x)$. Since $|T_k(x)| \le 1$
in $(-1,1)$, the last $m$ terms in the right-hand member of (9.8.6) are then
negligible, and the approximation (9.8.5) can then be replaced by

$$f(x) \approx \sum_{k=0}^{n-m} a_k T_k(x), \tag{9.8.10}$$

where $m > 0$, after which this approximation can be transformed back
to an expression of the form

$$f(x) \approx \sum_{k=0}^{n-m} \bar{A}_k x^k, \tag{9.8.11}$$

if this is desirable. In this way we obtain a polynomial approximation
to $f(x)$, over $(-1,1)$, involving fewer terms than would be required by the
truncated power series and tending to involve the smallest possible number
of polynomial terms which will supply an accuracy within the
prescribed tolerance limits.†

The calculations involved are facilitated by use of the two
following sets of relations, the second set expressing $T_k(x)$ in powers of $x$
and the first set being obtained by successively inverting the members
of the second set.

$$\begin{aligned}
1 &= T_0,\\
x &= T_1,\\
x^2 &= \tfrac{1}{2}(T_0 + T_2),\\
x^3 &= \tfrac{1}{4}(3T_1 + T_3),\\
x^4 &= \tfrac{1}{8}(3T_0 + 4T_2 + T_4),\\
x^5 &= \tfrac{1}{16}(10T_1 + 5T_3 + T_5),\\
x^6 &= \tfrac{1}{32}(10T_0 + 15T_2 + 6T_4 + T_6),\\
x^7 &= \tfrac{1}{64}(35T_1 + 21T_3 + 7T_5 + T_7),\\
x^8 &= \tfrac{1}{128}(35T_0 + 56T_2 + 28T_4 + 8T_6 + T_8),\\
x^9 &= \tfrac{1}{256}(126T_1 + 84T_3 + 36T_5 + 9T_7 + T_9),
\end{aligned}$$

$$\begin{aligned}
T_0 &= 1,\\
T_1 &= x,\\
T_2 &= 2x^2 - 1,\\
T_3 &= 4x^3 - 3x,\\
T_4 &= 8x^4 - 8x^2 + 1,\\
T_5 &= 16x^5 - 20x^3 + 5x,\\
T_6 &= 32x^6 - 48x^4 + 18x^2 - 1,\\
T_7 &= 64x^7 - 112x^5 + 56x^3 - 7x,\\
T_8 &= 128x^8 - 256x^6 + 160x^4 - 32x^2 + 1,\\
T_9 &= 256x^9 - 576x^7 + 432x^5 - 120x^3 + 9x.
\end{aligned} \tag{9.8.12}$$


In illustration, suppose that a polynomial approximation to $e^x$ is
required in $(-1,1)$, with a tolerance of 0.01. The truncated Maclaurin
series supplies a polynomial approximation of degree five,

$$e^x \approx 1 + x + \tfrac{1}{2}x^2 + \tfrac{1}{6}x^3 + \tfrac{1}{24}x^4 + \tfrac{1}{120}x^5 \equiv y(x), \tag{9.8.13}$$

with an error

$$|E(x)| = \left|\frac{e^{\xi}x^6}{720}\right| < \frac{e}{720} < 0.0038, \tag{9.8.14}$$

for which the neglect of the term $x^5/120$ would admit the possibility of a
total error exceeding the prescribed tolerance. The use of the first set of
relations in (9.8.12) transforms (9.8.13) into the equivalent form

$$y(x) = \tfrac{81}{64}T_0 + \tfrac{217}{192}T_1 + \tfrac{13}{48}T_2 + \tfrac{17}{384}T_3 + \tfrac{1}{192}T_4 + \tfrac{1}{1920}T_5, \tag{9.8.15}$$

where $T_k = T_k(x)$. Neglect of the last two terms will introduce an
additional error not exceeding $\tfrac{11}{1920} < 0.0058$ for all $x$ in $(-1,1)$. Thus,
with an error smaller in magnitude than 0.0096, we have

$$e^x \approx \tfrac{81}{64}T_0 + \tfrac{217}{192}T_1 + \tfrac{13}{48}T_2 + \tfrac{17}{384}T_3 \tag{9.8.16}$$

or, after using the second set in (9.8.12),

$$e^x \approx \tfrac{1}{384}(382 + 383x + 208x^2 + 68x^3) \qquad (|x| \le 1). \tag{9.8.17}$$

† It should be noticed that, in this procedure, the error $E_n(x)$ in (9.8.3) is accepted,
and an efficient approximation to the retained polynomial is sought. The
approximation obtained is generally not the best possible one, but will
differ little from it if $|E_n(x)|$ is small relative to $\epsilon$.

For the purpose of comparison, it may be noted that a similar manipulation
gives the form

$$y(x) = \tfrac{47}{40}P_0 + \tfrac{309}{280}P_1 + \tfrac{5}{14}P_2 + \tfrac{19}{270}P_3 + \tfrac{1}{105}P_4 + \tfrac{1}{945}P_5, \tag{9.8.18}$$

in terms of the Legendre polynomials $P_k = P_k(x)$. Here only the last
term could be neglected, so that a polynomial approximation of fourth
degree would be required.

The procedure described here, called "economization of power series"
by Lanczos, is useful in those situations when a minimization of the
number of numerical operations is desirable. It clearly can be applied
to any polynomial, whether that polynomial is obtained by truncating a
power series or otherwise, once the interval of interest has been
transformed to the interval $(-1,1)$.
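The economization of the fifth-degree Maclaurin polynomial for $e^x$ can be reproduced in exact rational arithmetic. The sketch below is one possible arrangement, not a transcription of any tabulated procedure: the function names are hypothetical, powers of $x$ are converted by the identity $x\,T_k = \tfrac12(T_{k+1} + T_{k-1})$ (with $x\,T_0 = T_1$), and the return to powers of $x$ uses the recurrence $T_{k+1} = 2xT_k - T_{k-1}$.

```python
from fractions import Fraction
from math import factorial

def power_to_chebyshev(A):
    """Coefficients a_k with sum A_j x**j = sum a_k T_k(x)."""
    n = len(A) - 1
    a = [Fraction(0)] * (n + 1)
    xj = [Fraction(1)] + [Fraction(0)] * n      # Chebyshev coefficients of x**0
    for j in range(n + 1):
        for k in range(n + 1):
            a[k] += A[j] * xj[k]
        nxt = [Fraction(0)] * (n + 1)           # Chebyshev coefficients of x**(j+1)
        for k, c in enumerate(xj):
            if k == 0:
                if n >= 1:
                    nxt[1] += c                 # x*T_0 = T_1
            else:
                nxt[k - 1] += c / 2             # x*T_k = (T_{k-1} + T_{k+1})/2
                if k + 1 <= n:
                    nxt[k + 1] += c / 2
        xj = nxt
    return a

def chebyshev_to_power(a):
    """Coefficients of powers of x in sum a_k T_k(x)."""
    n = len(a) - 1
    t_prev, t_cur = [Fraction(1)], [Fraction(0), Fraction(1)]   # T_0, T_1
    out = [Fraction(0)] * (n + 1)
    for k in range(n + 1):
        t = t_prev if k == 0 else t_cur
        for p, c in enumerate(t):
            out[p] += a[k] * c
        if k >= 1:                              # advance to T_{k+1} = 2x*T_k - T_{k-1}
            t_next = [Fraction(0)] + [2 * c for c in t_cur]
            for p, c in enumerate(t_prev):
                t_next[p] -= c
            t_prev, t_cur = t_cur, t_next
    return out

A = [Fraction(1, factorial(j)) for j in range(6)]    # 1, 1, 1/2, 1/6, 1/24, 1/120
a = power_to_chebyshev(A)
dropped = sum(abs(c) for c in a[4:])                 # bound on the neglected terms
economized = chebyshev_to_power(a[:4])
print(a, dropped, economized)
```

The bound `dropped` relies only on $|T_k(x)| \le 1$, so it is valid throughout $(-1,1)$ but is generally not attained.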

9.9. Approximation by Continued Fractions. Newton's divided-difference
polynomial interpolation formula (2.5.2), with an error term,
can be considered as the identity which results from writing

$$f(x) = u_0(x) \tag{9.9.1}$$

and effecting the successive substitutions

$$u_k(x) = u_k(x_k) + (x - x_k)\,u_{k+1}(x) \qquad (k = 0, 1, \ldots, n - 1), \tag{9.9.2}$$

with the abbreviation

$$u_k(x) = f[x_0, \ldots, x_{k-1}, x]. \tag{9.9.3}$$

Thus, for example, when $n = 3$ there follows

$$\begin{aligned}
f(x) &= u_0(x_0) + (x - x_0)\{u_1(x_1) + (x - x_1)[u_2(x_2) + (x - x_2)u_3(x)]\}\\
&= u_0(x_0) + (x - x_0)u_1(x_1) + (x - x_0)(x - x_1)u_2(x_2) + E(x)\\
&= f[x_0] + (x - x_0)f[x_0,x_1] + (x - x_0)(x - x_1)f[x_0,x_1,x_2] + E(x),
\end{aligned} \tag{9.9.4}$$

where

$$E(x) = (x - x_0)(x - x_1)(x - x_2)\,u_3(x) = (x - x_0)(x - x_1)(x - x_2)\,f[x_0,x_1,x_2,x]. \tag{9.9.5}$$


The algorithm for the calculation of the successive divided differences
follows directly from (9.9.2) and (9.9.3), with $x = x_k$, in the form

$$f[x_0, \ldots, x_{k-2}, x_{k-1}, x_k] = \frac{f[x_0, \ldots, x_{k-2}, x_k] - f[x_0, \ldots, x_{k-2}, x_{k-1}]}{x_k - x_{k-1}}. \tag{9.9.6}$$
The result of assuming that the $(n + 1)$th divided difference $u_{n+1}(x)$
is identically zero (or that the $n$th divided difference is constant) is the
equation of the polynomial $y(x)$, of degree $n$ or less, which agrees with
$f(x)$ at the $n + 1$ points $x_0, x_1, \ldots, x_n$. If $u_{n+1}(x)$ actually vanishes
identically, then $y(x) \equiv f(x)$.
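The divided-difference scheme, in the variant where the leading arguments are held fixed and only the last one varies, can be sketched as follows. The function names and the test data (the cubic $x^3$ at the points 0, 1, 2, 3, 4) are assumptions made for illustration only.

```python
from fractions import Fraction

def divided_differences(xs, fs):
    """Return f[x0], f[x0,x1], ..., f[x0,...,xn].

    At stage k, col[i] holds f[x0,...,x_{k-1},x_i] for i >= k, formed from the
    previous column as (col[i] - col[k-1]) / (x_i - x_{k-1}).
    """
    col = [Fraction(v) for v in fs]
    coeffs = [col[0]]
    for k in range(1, len(xs)):
        col = [None] * k + [(col[i] - col[k - 1]) / (xs[i] - xs[k - 1])
                            for i in range(k, len(xs))]
        coeffs.append(col[k])
    return coeffs

def newton_eval(coeffs, xs, x):
    """Evaluate the nested Newton form at x."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = coeffs[k] + (x - xs[k]) * result
    return result

xs = [0, 1, 2, 3, 4]
fs = [x ** 3 for x in xs]
coeffs = divided_differences(xs, fs)
print(coeffs, newton_eval(coeffs, xs, 5))
```

For the cubic the fourth divided difference vanishes, so the interpolating polynomial coincides with the function itself, in line with the closing remark above.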
A great variety of other identities can be obtained in a similar way,
and interpreted similarly as approximation formulas, by making use of
other sets of transformations, in place of (9.9.2). In particular, the
substitution sequence

$$f(x) = v_0(x), \qquad v_k(x) = v_k(x_k) + \frac{x - x_k}{v_{k+1}(x)} \qquad (k = 0, 1, 2, \ldots) \tag{9.9.7}$$

leads to an interesting and useful result. We see that the first three
substitutions give

$$f(x) = v_0(x) = v_0(x_0) + \frac{x - x_0}{v_1(x)} = v_0(x_0) + \cfrac{x - x_0}{v_1(x_1) + \cfrac{x - x_1}{v_2(x)}} = v_0(x_0) + \cfrac{x - x_0}{v_1(x_1) + \cfrac{x - x_1}{v_2(x_2) + \cfrac{x - x_2}{v_3(x)}}}. \tag{9.9.8}$$

More generally, we are thus led to the continued-fraction representation

$$f(x) = a_0 + \cfrac{x - x_0}{a_1 + \cfrac{x - x_1}{a_2 + \cfrac{x - x_2}{a_3 + \cdots}}}, \tag{9.9.9}$$

where

$$a_k = v_k(x_k), \tag{9.9.10}$$

and where, when the fraction is terminated after $n$ divisions, the constant
$a_n$ is to be replaced by $a_n + (x - x_n)/v_{n+1}(x)$ in the last denominator.
If we then set $x = x_k$, where $0 \le k \le n$, the fraction terminates before
the residual $(x - x_n)/v_{n+1}(x)$ is introduced. Thus, since (9.9.9) is an
identity, the result of replacing $1/v_{n+1}(x)$ by zero (that is, terminating the
fraction with $a_n$) will give a function $r_{n+1}(x)$ which agrees with $f(x)$ at the
$n + 1$ points $x_0, \ldots, x_n$ under the assumptions that the constants
$a_0, \ldots, a_n$ are actually existent and that the portion of the truncated
fraction inferior to $x - x_k$ does not vanish when $x = x_k$, for $k = 0, \ldots,
n - 1$. The result of this termination may be called the $(n + 1)$th
convergent (or approximant) of the representation, so that the first
convergent is merely the constant $a_0 = f(x_0)$.
If we introduce the notation

$$v_k(x) = \phi_k[x_0, x_1, \ldots, x_{k-1}, x], \tag{9.9.11}$$

so that (9.9.10) becomes

$$a_k = \phi_k[x_0, x_1, \ldots, x_{k-1}, x_k], \tag{9.9.12}$$

reference to (9.9.7) gives

$$\phi_0[x] = f(x), \qquad \phi_1[x_0,x] = \frac{x - x_0}{\phi_0[x] - \phi_0[x_0]} = \frac{x - x_0}{f(x) - f(x_0)}, \qquad \phi_2[x_0,x_1,x] = \frac{x - x_1}{\phi_1[x_0,x] - \phi_1[x_0,x_1]},$$

and, in general,

$$\phi_k[x_0, x_1, \ldots, x_{k-1}, x] = \frac{x - x_{k-1}}{\phi_{k-1}[x_0, \ldots, x_{k-2}, x] - \phi_{k-1}[x_0, \ldots, x_{k-2}, x_{k-1}]}. \tag{9.9.13}$$

Accordingly, we have also

$$\phi_k[x_0, \ldots, x_{k-1}, x_k] = \frac{x_k - x_{k-1}}{\phi_{k-1}[x_0, \ldots, x_{k-2}, x_k] - \phi_{k-1}[x_0, \ldots, x_{k-2}, x_{k-1}]}. \tag{9.9.14}$$

Thus $\phi_1[x_0,x_1]$ is the reciprocal of the first divided difference of $f(x)$, relative
to $x_0$ and $x_1$, $\phi_2[x_0,x_1,x_2]$ is the inverted divided difference of the
inverted first divided difference $\phi_1[x_0,x]$, relative to $x_1$ and $x_2$, and, in general,
$\phi_k[x_0, \ldots, x_k]$ is the inverted divided difference of $\phi_{k-1}[x_0, \ldots, x_{k-2}, x]$,
relative to $x_{k-1}$ and $x_k$. For brevity, we will refer to the quantity
defined by (9.9.14) as a $k$th inverted difference of $f(x)$.

Whereas the definition shows that the inverted difference
$\phi_k[x_0, \ldots, x_k]$ is symmetrical in its last two arguments $x_{k-1}$ and $x_k$, it is
not generally symmetrical in its other arguments.† Thus it must be
calculated from the specific inverted differences $\phi_{k-1}[x_0, \ldots, x_{k-2}, x_{k-1}]$ and
$\phi_{k-1}[x_0, \ldots, x_{k-2}, x_k]$, which possess its first $k - 1$ arguments. The
following calculational arrangement is convenient for this purpose:

    x0   f(x0)
    x1   f(x1)   φ1[x0,x1]
    x2   f(x2)   φ1[x0,x2]   φ2[x0,x1,x2]
    x3   f(x3)   φ1[x0,x3]   φ2[x0,x1,x3]   φ3[x0,x1,x2,x3]

Here, for example, we have

$$\phi_2[x_0,x_1,x_3] = \frac{x_3 - x_1}{\phi_1[x_0,x_3] - \phi_1[x_0,x_1]}.$$

The diagonal elements thus are the desired constants $a_0, a_1, a_2, a_3, \ldots$
which appear in (9.9.9).

† A related quantity, which possesses complete symmetry, is considered in §9.12.

In illustration, for the given data

    x    | 0    1     2     3     4      5      6
    f(x) | 2   3/2   4/5   1/2   6/17   7/26   8/37

we may number the abscissas in increasing algebraic order and,
accordingly, form the array

    x | f       φ1       φ2      φ3     φ4
    0 | 2
    1 | 3/2    -2
    2 | 4/5    -5/3      3
    3 | 1/2    -2        ∞       0
    4 | 6/17   -17/7    -7      -1/5   -5
    5 | 7/26   -26/9    -9/2    -2/5   -5
    6 | 8/37   -37/11   -11/3   -3/5   -5

for which the fourth inverted differences are equal. Accordingly, if we use
only the first five points $x_0 = 0$, $x_1 = 1$, $x_2 = 2$, $x_3 = 3$, and $x_4 = 4$, we
have $a_0 = 2$, $a_1 = -2$, $a_2 = 3$, $a_3 = 0$, and $a_4 = -5$, so that (9.9.9)
becomes

$$f(x) \approx 2 + \cfrac{x}{-2 + \cfrac{x - 1}{3 + \cfrac{x - 2}{0 + \cfrac{x - 3}{-5}}}} \equiv r_5(x), \tag{9.9.15}$$

where the approximation would become exact if the last denominator
were replaced by the (unknown) quantity $-5 + (x - 4)/v_5(x)$.
Thus $r_5(x)$ may be expected to agree with $f(x)$ at the five points employed
in its determination. But since the tabular array shows that the same
approximation would be obtained if the abscissa $x = 4$ were replaced by
either $x = 5$ or $x = 6$, it may be expected that $r_5(x)$ will agree with $f(x)$
at those two points as well.
Successive reductions will convert the right-hand member of (9.9.1 


to the simpler form 
‘ 2+2 


91 
r(x) = TW (9 
if this reduction is desired, and the agreement can be verified me 
Furthermore, the respective approximating convergents correspon ing 
termination of the fraction with ao, a1, and as, and hence to collocation 
one, two, and three successive points, are found to be 


—2 14 — 5x 


ro(a) - ‘S rg(X) = cr 


r(x) = 2, 
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However, since the present example is exceptional to the extent that 
a; = 0, the fourth convergent does not agree with f(x) at the four points 
x =0,1,2,and3. Indeed, this convergent is identical with re(x), which 
agrees with f(x) at x = 0, 1, and also at x = 3, but does not do so at 
x=2., 
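The inverted-difference array of this example can be checked mechanically. The following is a minimal sketch in Python, not part of the original text: the function names `inverted_quotient` and `inverted_differences`, the use of `None` to stand for an infinite entry, and the `max_order` cutoff are all illustrative assumptions. Exact rational arithmetic is used so that the vanishing denominator which produces the entry $\infty$ is detected exactly.

```python
from fractions import Fraction

def inverted_quotient(dx, phi_hi, phi_lo):
    """Return dx / (phi_hi - phi_lo); None stands for an infinite entry."""
    if phi_hi is None and phi_lo is None:
        raise ValueError("indeterminate: both entries are infinite")
    if phi_hi is None or phi_lo is None:
        return Fraction(0)            # finite numerator over infinite denominator
    diff = phi_hi - phi_lo
    return None if diff == 0 else dx / diff

def inverted_differences(xs, fs, max_order=None):
    """rows[i][k] = phi_k[x_0, ..., x_{k-1}, x_i]; the diagonal holds a_0, a_1, ..."""
    rows = []
    for i, (x, f) in enumerate(zip(xs, fs)):
        top = i if max_order is None else min(i, max_order)
        row = [Fraction(f)]
        for k in range(1, top + 1):
            dx = Fraction(x) - Fraction(xs[k - 1])
            # rows[k-1][k-1] is the diagonal element a_{k-1}
            row.append(inverted_quotient(dx, row[k - 1], rows[k - 1][k - 1]))
        rows.append(row)
    return rows

# Data of the example: f(x) = (2 + x)/(1 + x^2) at x = 0, 1, ..., 6.
xs = list(range(7))
fs = [Fraction(2 + x, 1 + x * x) for x in xs]
rows = inverted_differences(xs, fs, max_order=4)
diagonal = [rows[k][k] for k in range(5)]
print(diagonal)   # the constants a_0, ..., a_4 of the continued fraction
```

The cutoff at fourth order is deliberate: since the fourth inverted differences are all equal, any fifth inverted difference would involve a zero denominator.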

9.10. Nature of Continued-fraction Approximations. It is easily seen that the $n$th convergent of the continued fraction (9.9.9) is expressible in the form

$$r_n(x) = \frac{\alpha_0 + \alpha_1 x + \cdots + \alpha_p x^p}{\beta_0 + \beta_1 x + \cdots + \beta_{p-1} x^{p-1}} \qquad (n = 2p) \tag{9.10.1a}$$

if $n$ is even, and is of the form

$$r_n(x) = \frac{\alpha_0 + \alpha_1 x + \cdots + \alpha_p x^p}{\beta_0 + \beta_1 x + \cdots + \beta_p x^p} \qquad (n = 2p + 1) \tag{9.10.1b}$$

if $n$ is odd. Thus the $n$th convergent affords an approximation to $f(x)$ by a ratio of polynomials, that is, by a rational function of $x$, which generally agrees with $f(x)$ at the $n$ points $x_0, x_1, \ldots, x_{n-1}$ if $a_{n-1} \ne 0$, and if all preceding $a$'s are finite.

This situation is in accordance with the fact that, since the numerator and denominator of either form of (9.10.1) can be divided through by any one of the nonzero constants, the first form involves $2p$ independent parameters and the second form $2p + 1$ such parameters, so that in either case $n$ independent constants are available for the determination of the approximation.

Given a set of $n$ distinct points, there cannot exist more than one irreducible rational function† of the form (9.10.1) which takes on prescribed values at those points. The proof follows simply by first writing (9.10.1) in the form $r_n(x) = M_n(x)/N_n(x)$, where $M_n$ and $N_n$ are polynomials, and supposing that another such ratio, $\bar{M}_n(x)/\bar{N}_n(x)$, takes on the same values as does $r_n(x)$ at $n$ distinct points. Here, in accordance with (9.10.1), the degrees of $M_n$ and $\bar{M}_n$ cannot exceed $n/2$ when $n$ is even or $(n - 1)/2$ when $n$ is odd, whereas the degrees of $N_n$ and $\bar{N}_n$ cannot exceed $(n - 2)/2$ when $n$ is even or $(n - 1)/2$ when $n$ is odd. It then follows that the function $M_n(x)\bar{N}_n(x) - \bar{M}_n(x)N_n(x)$ also vanishes at those points. But, since this function is a polynomial of degree $n - 1$ or less, it must therefore vanish identically, so that $M_n(x)\bar{N}_n(x) = \bar{M}_n(x)N_n(x)$. Under the assumption that $r_n(x)$ is irreducible, $M_n$ and $N_n$ possess no common linear factors. Thus all linear factors of $M_n$ must also be factors of $\bar{M}_n$, and the converse is also true since $\bar{M}_n/\bar{N}_n$ is also assumed to be irreducible. The same argument applies to $N_n$ and $\bar{N}_n$, so that the respective numerators and denominators can differ only to the extent of a common constant multiplicative factor, as was to be shown.

† A rational function is said to be reducible if its numerator and denominator possess a common polynomial factor, other than a constant.

However, there may be no such function. For example, if we attempt to determine directly a function of the form

$$r_4(x) = \frac{\alpha_0 + \alpha_1 x + \alpha_2 x^2}{\beta_0 + \beta_1 x}$$

which takes on the values prescribed in the preceding example at the four points $x = 0$, 1, 2, and 3, we must solve the simultaneous equations

$$\begin{aligned}
2\beta_0 &= \alpha_0, \\
\tfrac{3}{2}(\beta_0 + \beta_1) &= \alpha_0 + \alpha_1 + \alpha_2, \\
\tfrac{4}{5}(\beta_0 + 2\beta_1) &= \alpha_0 + 2\alpha_1 + 4\alpha_2, \\
\tfrac{1}{2}(\beta_0 + 3\beta_1) &= \alpha_0 + 3\alpha_1 + 9\alpha_2,
\end{aligned}$$

which result from clearing fractions and equating the resultant members at the four points, and we find that the general solution is given by the relations

$$\alpha_0 = 8\alpha_2, \qquad \alpha_1 = -6\alpha_2, \qquad \beta_0 = 4\alpha_2, \qquad \beta_1 = -2\alpha_2,$$

where $\alpha_2$ is arbitrary. Thus the assumed form becomes

$$r_4(x) = \frac{8 - 6x + x^2}{4 - 2x} = \frac{(4 - x)(2 - x)}{2(2 - x)}$$

and is reducible to $r_2(x) = (4 - x)/2$, in accordance with the result obtained from (9.9.15). The original form is indeterminate at $x = 2$, whereas the reduced form does not take on the prescribed value at that point. Thus the defect of the fourth convergent of (9.9.15) is due to the nonexistence of a form of the type required at that stage, rather than to a failure of the determining process.

In the case of (9.9.15), a warning was served by the fact that $a_3 = 0$. It should be remarked, however, that the $k$th convergent may be defective, for the same reason as above, even though $a_{k-1}$ does not vanish, although this situation is an unusual one. In illustration, the data

$$\begin{array}{c|cccc}
x & 0 & 1 & 2 & 3 \\ \hline
f(x) & 2 & 1 & 1 & 0
\end{array}$$

lead to the inverted-difference array

$$\begin{array}{c|cccc}
x & f & \phi_1 & \phi_2 & \phi_3 \\ \hline
0 & 2 \\
1 & 1 & -1 \\
2 & 1 & -2 & -1 \\
3 & 0 & -\frac{3}{2} & -4 & -\frac{1}{3}
\end{array}$$

in which no diagonal element vanishes. Whereas the corresponding fraction

$$f(x) \approx 2 + \cfrac{x}{-1 + \cfrac{x - 1}{-1 + \cfrac{x - 2}{-\frac{1}{3}}}} = \frac{12 - 13x + 3x^2}{6 - 4x}$$

is properly exact at the four tabular points, the third convergent of the fraction is given by

$$r_3(x) = 2 + \cfrac{x}{-1 - (x - 1)} = 2 - \frac{x}{x}$$

and is undefined at the tabular point $x = 0$. It is easily seen that there exists no irreducible fraction of the form $(\alpha_0 + \alpha_1 x)/(\beta_0 + \beta_1 x)$ which takes on the first three prescribed values.

On the other hand, even though the $n$ given data serve to determine a rational approximation of the form (9.10.1), the continued-fraction expansion will fail to exist, in the form assumed, if $a_k = \infty$ for some $k < n - 1$. Thus, whereas the data

$$\begin{array}{c|cccccc}
x & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline
f(x) & 1 & 1 & \frac{3}{5} & \frac{2}{5} & \frac{5}{17} & \frac{3}{13}
\end{array}$$

correspond to the function

$$f(x) = \frac{1 + x}{1 + x^2},$$

it is seen that $\phi_1[0,1] = \infty$, so that there exists no expansion, of the form

$$f(x) = a_0 + \cfrac{x}{a_1 + \cfrac{x - 1}{a_2 + \cdots}},$$

which takes on the prescribed values when $x_0 = 0$, $x_1 = 1$, and $x_2 = 2$.

This difficulty can be averted here by reordering the abscissas so that the equal ordinates are not consecutive. Thus, if we take $x_0 = 0$, $x_1 = 2$, $x_2 = 1$, $x_3 = 3$, and $x_4 = 4$, we obtain the following array:

$$\begin{array}{c|ccccc}
x & f & \phi_1 & \phi_2 & \phi_3 & \phi_4 \\ \hline
0 & 1 \\
2 & \frac{3}{5} & -5 \\
1 & 1 & \infty & 0 \\
3 & \frac{2}{5} & -5 & \infty & 0 \\
4 & \frac{5}{17} & -\frac{17}{3} & -3 & -1 & -1 \\
5 & \frac{3}{13} & -\frac{13}{2} & -2 & -2 & -1
\end{array}$$

The additional line is included to illustrate the constancy of the fourth inverted difference in the present case.

From these results we deduce the approximation

$$f(x) \approx 1 + \cfrac{x}{-5 + \cfrac{x - 2}{0 + \cfrac{x - 1}{0 + \cfrac{x - 3}{-1}}}}$$

which properly reduces exactly to $(1 + x)/(1 + x^2)$. In this form, the successive convergents are 1, $(5 - x)/5$, 1, $(5 - x)/5$, and $(1 + x)/(1 + x^2)$. As was predicted by the presence of the zeros, the third and fourth convergents are both defective, in that the third takes on the prescribed values only at the first and third points while the fourth does so only at the first, second, and fourth points. If, for example, the abscissas are taken in the order 1, 2, 3, 4, 0, we obtain the form

$$f(x) \approx 1 + \cfrac{x - 1}{-\frac{5}{2} + \cfrac{x - 2}{-\frac{6}{5} + \cfrac{x - 3}{\frac{35}{2} + \cfrac{x - 4}{\frac{1}{5}}}}}$$

which naturally also reduces to $(1 + x)/(1 + x^2)$, but which possesses no defective convergents.

In the usual cases, the ordinates can be introduced in any order. For calculation near the beginning of a tabulation, it is usually desirable to number the abscissas in increasing algebraic order, whereas near the end of a tabulation the reverse numbering is desirable, in analogy to the Newton forward- and backward-difference polynomial interpolation formulas. Inside the tabular range it often is desirable first to introduce the abscissa nearest the abscissa of the interpolant, and then successively to introduce abscissas at increasing distance from $x_0$, alternately forward and backward, in analogy to the central-difference interpolation formulas. These choices tend to maximize the effective initial rate of convergence of the sequence of successive convergents in practical situations, for which the sequence generally does not terminate.

9.11. Determination of Convergents of Continued Fractions. In place of basing numerical calculations on the approximation expressed as a truncated continued fraction, or of transforming that fraction to a simple fraction by direct reduction, we may conveniently make use of certain recurrence formulas which may be derived as follows.

By reference to (9.9.7), it is easily seen that $f(x)$ is expressible as the ratio of two linear functions of any $\phi_k(x)$, say in the convenient form

$$f(x) = \frac{(x - x_{k-1})P_k(x) + \phi_k(x)M_k(x)}{(x - x_{k-1})Q_k(x) + \phi_k(x)N_k(x)}. \tag{9.11.1}$$

In order to determine $M_k$, $N_k$, $P_k$, and $Q_k$, we may notice that $f(x)$ is consequently also given both by the result of replacing $k$ by $k + 1$ in (9.11.1) and also by the result of using (9.9.7) to express $\phi_k(x)$ in terms of $\phi_{k+1}(x)$ in (9.11.1), so that there must follow

$$\frac{(x - x_k)P_{k+1}(x) + \phi_{k+1}(x)M_{k+1}(x)}{(x - x_k)Q_{k+1}(x) + \phi_{k+1}(x)N_{k+1}(x)} = \frac{(x - x_k)M_k(x) + \phi_{k+1}(x)[a_kM_k(x) + (x - x_{k-1})P_k(x)]}{(x - x_k)N_k(x) + \phi_{k+1}(x)[a_kN_k(x) + (x - x_{k-1})Q_k(x)]}. \tag{9.11.2}$$

This requirement is satisfied, for arbitrary $\phi_{k+1}$, if the desired functions satisfy the relations

$$\begin{aligned}
M_{k+1}(x) &= a_kM_k(x) + (x - x_{k-1})M_{k-1}(x), & P_k(x) &= M_{k-1}(x), \\
N_{k+1}(x) &= a_kN_k(x) + (x - x_{k-1})N_{k-1}(x), & Q_k(x) &= N_{k-1}(x),
\end{aligned} \tag{9.11.3}$$

in accordance with which $M_k$, $N_k$, $P_k$, and $Q_k$ clearly will be polynomials in $x$ if $M_0$, $M_1$, $N_0$, and $N_1$ are polynomials. Thus we may write (9.11.1) in the form

$$f(x) = \frac{M_k(x) + \dfrac{(x - x_{k-1})M_{k-1}(x)}{\phi_k[x_0, \ldots, x_{k-1}, x]}}{N_k(x) + \dfrac{(x - x_{k-1})N_{k-1}(x)}{\phi_k[x_0, \ldots, x_{k-1}, x]}}. \tag{9.11.4}$$

Since, when $k = 1$, this form must reduce to the form given by (9.9.7),

$$f(x) = a_0 + \frac{x - x_0}{\phi_1[x_0, x]},$$

we must have $N_0(x) = 0$, $M_0(x) = N_1(x)$, and $M_1(x) = a_0N_1(x)$. It is convenient to take $N_1(x) = 1$.

Thus $M_k$ and $N_k$ can be determined by the recurrence formulas

$$M_{k+1}(x) = a_kM_k(x) + (x - x_{k-1})M_{k-1}(x), \qquad M_0(x) = 1, \quad M_1(x) = a_0 \tag{9.11.5}$$

and

$$N_{k+1}(x) = a_kN_k(x) + (x - x_{k-1})N_{k-1}(x), \qquad N_0(x) = 0, \quad N_1(x) = 1. \tag{9.11.6}$$

In particular, the $k$th convergent to $f(x)$ is given simply by

$$r_k(x) = \frac{M_k(x)}{N_k(x)}. \tag{9.11.7}$$


The error associated with the approximation $f(x) \approx r_k(x)$ can be estimated by use of (9.11.4) if information with regard to $\phi_k[x_0, \ldots, x_{k-1}, x]$ is available, say, in the form of sample values of the inverted difference formed with $x_0, \ldots,$ and $x_{k-1}$ as its first $k$ arguments. For this purpose it is convenient to rewrite (9.11.4) in the equivalent form

$$f(x) - r_k(x) = -\theta_k(x)[r_k(x) - r_{k-1}(x)], \tag{9.11.8}$$

where

$$\theta_k(x) = \frac{(x - x_{k-1})N_{k-1}(x)}{(x - x_{k-1})N_{k-1}(x) + \phi_k[x_0, \ldots, x_{k-1}, x]\,N_k(x)}. \tag{9.11.9}$$

When $f(x)$ is not a rational function, the sequence of convergents generally is infinite, and it may or may not tend to $f(x)$ as $n \to \infty$. However, it usually at least approaches $f(x)$ more and more closely, for any fixed value of $x$ inside the range of the tabular values employed, as $n$ increases up to a certain stage. The determination of successive convergents is desirable in order that the rate of "effective convergence" may be estimated.

It is useful to notice that the expansion (9.9.9) can be expressed in the alternative forms

$$f(x) = a_0 + \cfrac{x - x_0}{a_1 + \cfrac{x - x_1}{a_2 + \cfrac{x - x_2}{a_3 + \cdots}}} = a_0 + \cfrac{\dfrac{x - x_0}{a_1}}{1 + \cfrac{\dfrac{x - x_1}{a_1a_2}}{1 + \cfrac{\dfrac{x - x_2}{a_2a_3}}{1 + \cdots}}} \tag{9.11.10}$$

if none of the $a_i$ vanish. The more compact symbolic arrangement

$$f(x) = a_0 + \frac{x - x_0}{a_1 +}\;\frac{x - x_1}{a_2 +}\;\frac{x - x_2}{a_3 +}\cdots \tag{9.11.11}$$

of the first form is often used.

Approximation by rational functions is often useful in the neighborhood of a point $x_0$ at which the true function $f(x)$ becomes infinite. In such cases it may alternatively be preferable to apply polynomial interpolation to the product $(x - x_0)^m f(x)$, where $m$ is determined (analytically or empirically) in an appropriate way.† In illustration, the following calculation is for the purpose of interpolating for cot 0.15 (= 6.6166) from given three-place values.

† If $f(x)$ becomes logarithmically infinite at $x_0$, neither type of approximation is efficient.

$$\begin{array}{c|c|ccc|c|cc|c}
x_k & f(x_k) & \phi_1 & \phi_2 & \phi_3 & x - x_k & M_k & N_k & r_k \\ \hline
0.1 & 9.967 & & & & 0.05 & 1 & 0 & \\
0.2 & 4.933 & -0.019865 & & & -0.05 & 9.967 & 1 & 9.967 \\
0.3 & 3.233 & -0.029700 & -10.168 & & -0.15 & -0.14799 & -0.019865 & 7.450 \\
0.4 & 2.365 & -0.039463 & -10.205 & -2.70 & -0.25 & 1.00641 & 0.15199 & 6.622 \\
 & & & & & & -2.69511 & -0.40739 & 6.6156
\end{array}$$

With this arrangement, the two factors in each product involved in (9.11.5) and (9.11.6) appear in the same horizontal line. Thus, for example, we have

$$\begin{aligned}
M_3 &= (-10.168)(-0.14799) + (-0.05)(9.967) = 1.00641, \\
N_4 &= (-2.70)(0.15199) + (-0.15)(-0.019865) = -0.40739.
\end{aligned}$$

In the calculation of the successive inverted differences, about one more digit was retained in each inverted difference than would be expected to be significant if all digits retained in the two preceding entries, from which it is calculated, were correct, when account is taken of the loss of significant figures in the subtractions involved. The tabulated $a$'s are then treated as though they were exact in the calculation of the $M$'s and $N$'s, so that, for example, (at least) five digits are retained in $N_4$, even though not more than three of its digits would be significant if the value $-2.70$ were correct to the places given, and only to those places.

Because of the fact that errors in the given ordinates and in the $a$'s, $M$'s, and $N$'s enter into the determination of the required $r$'s in a nonlinear way, it is difficult to estimate in advance the number of digits which should be retained in each intermediate calculation, but it is usually desirable to retain at least as many digits as are required by the preceding rule. In the present case, the tabulated value of $a_3$ would be modified in its third digit if additional digits were retained in the calculation of preceding inverted differences, and this modification would change $M_4$ and $N_4$ in the third digit. However, the calculated value of the ratio $r_4 = M_4/N_4$ would be modified by only two units in its fifth digit. The deviation of the calculated value $r_4$ from the true value, by one unit in its fourth digit, is due principally to the round-off errors in the given data (see Prob. 42).

The fact that the value of the convergent $r_4$ itself is not sensitive to appreciable errors in $a_3$ can be seen more directly by inspection of the actual truncated continued fraction:

$$r_4(0.15) = 9.967 + \cfrac{0.05}{-0.019865 + \cfrac{-0.05}{-10.168 + \cfrac{-0.15}{-2.70}}}.$$

In this particular example, the near-linearity of the first inverted difference $\phi_1[0.1, x]$ suggests the use of polynomial extrapolation of the three available values for the determination of $\phi_1[0.1, 0.15]$. The use of Newton's forward-difference formula (retaining the second difference) gives $\phi_1[0.1, 0.15] \approx -0.014923$, so that there follows

$$\frac{0.15 - 0.10}{f(0.15) - 9.967} \approx -0.014923, \qquad f(0.15) \approx 6.6165,$$

and the calculation of inverted differences of higher order is avoided.

It can be shown (see Prob. 38) that the $k$th convergent of (9.11.10) can be expressed in the form

$$r_k = a_0 + \frac{x - x_0}{N_1N_2} - \frac{(x - x_0)(x - x_1)}{N_2N_3} + \frac{(x - x_0)(x - x_1)(x - x_2)}{N_3N_4} - \cdots + (-1)^k \frac{(x - x_0)(x - x_1) \cdots (x - x_{k-2})}{N_{k-1}N_k}. \tag{9.11.12}$$

In the case of the preceding example, the $k$th convergent $r_k$ is thus obtained by retaining $k$ terms in the sum

$$9.967 - \frac{0.05}{0.019865} - \frac{0.0025}{0.0030193} - \frac{0.000375}{0.06192},$$

and successive results agree, to three decimal places, with the values obtained previously.
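The recurrences (9.11.5) and (9.11.6) translate directly into a short loop. The sketch below, in Python, is an illustration rather than part of the original text; the function name `convergents` and its argument layout are assumptions. It uses the tabulated constants $a_0, \ldots, a_3$ of the cot 0.15 example and carries full floating-point precision, so its later convergents differ slightly from the tabulated ones, which were formed from rounded intermediate values.

```python
def convergents(a, xs, x):
    """Return [r_1, ..., r_n] with r_k = M_k / N_k from (9.11.5)-(9.11.7).

    a  : constants a_0, a_1, ... (diagonal of the inverted-difference array)
    xs : abscissas x_0, x_1, ... in the order in which they were introduced
    x  : point at which the convergents are evaluated
    """
    M_prev, M = 1.0, a[0]      # M_0 = 1, M_1 = a_0
    N_prev, N = 0.0, 1.0       # N_0 = 0, N_1 = 1
    out = [M / N]
    for k in range(1, len(a)):
        # M_{k+1} = a_k M_k + (x - x_{k-1}) M_{k-1}, and likewise for N
        M_prev, M = M, a[k] * M + (x - xs[k - 1]) * M_prev
        N_prev, N = N, a[k] * N + (x - xs[k - 1]) * N_prev
        out.append(M / N)
    return out

a = [9.967, -0.019865, -10.168, -2.70]
xs = [0.1, 0.2, 0.3, 0.4]
r = convergents(a, xs, 0.15)
for k, value in enumerate(r, start=1):
    print(k, round(value, 4))
```

No guard against a vanishing $N_k$ is included; a defective convergent of the kind discussed in §9.10 would show up here as a division by zero.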

9.12. Thiele's Continued-fraction Expansions. Whereas the "inverted difference" $\phi_k[x_0, \ldots, x_{k-2}, x_{k-1}, x_k]$ of a function $f(x)$ is symmetrical only in its last two arguments, it happens that the quantity

$$\rho_k[x_0, \ldots, x_k] = \phi_k[x_0, \ldots, x_k] + \phi_{k-2}[x_0, \ldots, x_{k-2}] + \phi_{k-4}[x_0, \ldots, x_{k-4}] + \cdots \tag{9.12.1}$$

is symmetrical in all its $k + 1$ arguments. Here the last term on the right is $\phi_0[x_0]$ if $k$ is even, and is $\phi_1[x_0, x_1]$ if $k$ is odd. This quantity is often known as a $k$th reciprocal difference of $f(x)$.

In particular, we have

$$\rho_0[x_0] = \phi_0[x_0] = f(x_0), \qquad \rho_1[x_0, x_1] = \phi_1[x_0, x_1] = \frac{x_1 - x_0}{f(x_1) - f(x_0)}, \tag{9.12.2}$$

and calculation shows that, with $f_i \equiv f(x_i)$,

$$\rho_2[x_0, x_1, x_2] = \frac{f_0f_1(x_1 - x_0) + f_1f_2(x_2 - x_1) + f_2f_0(x_0 - x_2)}{f_0(x_1 - x_2) + f_1(x_2 - x_0) + f_2(x_0 - x_1)}, \tag{9.12.3}$$

in which cases the symmetry is apparent. While an inductive generalization is possible, the following argument is somewhat more simple. It is easily verified that, when use is made of (9.11.5) and (9.11.6), the $n$th convergent of (9.9.9) is given by (9.10.1a) when $n = 2p$, with $\alpha_p = 1$ and $\beta_{p-1} = \rho_{2p-1}[x_0, \ldots, x_{2p-1}]$, and by (9.10.1b) when $n = 2p + 1$, with $\alpha_p = \rho_{2p}[x_0, \ldots, x_{2p}]$ and $\beta_p = 1$. Thus it follows that $\rho_{2p}$ is the ratio of the "leading" coefficients in the numerator and denominator of the rational function of form (9.10.1b) which agrees with $f(x)$ at the $2p + 1$ points $x_0, \ldots, x_{2p}$, whereas $\rho_{2p-1}$ is the reciprocal of that ratio for the rational function of form (9.10.1a) which agrees with $f(x)$ at the $2p$ points $x_0, \ldots, x_{2p-1}$. These ratios are clearly independent of any ordering of the points involved.
Since (9.12.1) implies that

$$\rho_k[x_0, \ldots, x_k] - \rho_{k-2}[x_0, \ldots, x_{k-2}] = \phi_k[x_0, \ldots, x_k], \tag{9.12.4}$$

reference to (9.9.13) shows that the successive reciprocal differences may be obtained by use of the recurrence formula

$$\rho_k[x_0, \ldots, x_{k-2}, x_{k-1}, x_k] = \frac{x_k - x_{k-1}}{\rho_{k-1}[x_0, \ldots, x_{k-2}, x_k] - \rho_{k-1}[x_0, \ldots, x_{k-2}, x_{k-1}]} + \rho_{k-2}[x_0, \ldots, x_{k-2}]. \tag{9.12.5}$$

While this formula is less simply applied than (9.9.13), the symmetry of the $k$th reciprocal difference permits its calculation from any two $(k - 1)$th reciprocal differences having $k - 1$ of its arguments in common, together with the $(k - 2)$th reciprocal difference formed with those arguments.

Thus, in particular, a reciprocal-difference table may be constructed in the convenient form

$$\begin{array}{ccccc}
x_0 & f(x_0) \\
 & & \rho_1[x_0,x_1] \\
x_1 & f(x_1) & & \rho_2[x_0,x_1,x_2] \\
 & & \rho_1[x_1,x_2] & & \rho_3[x_0,x_1,x_2,x_3] \\
x_2 & f(x_2) & & \rho_2[x_1,x_2,x_3] \\
 & & \rho_1[x_2,x_3] \\
x_3 & f(x_3)
\end{array}$$

From this table we may determine the coefficients in (9.9.9) by combining (9.9.12) and (9.12.4) (this procedure is due to Thiele [19]), so that

$$a_0 = f(x_0), \qquad a_1 = \rho_1[x_0, x_1], \qquad a_2 = \rho_2[x_0, x_1, x_2] - f(x_0), \qquad a_3 = \rho_3[x_0, x_1, x_2, x_3] - \rho_1[x_0, x_1],$$

and so forth. Thus the required coefficients are formed from (but are not identical with) reciprocal differences appearing in the forward diagonal beginning with $f(x_0)$. Furthermore, because of the symmetry, the data from the same table are available for the determination of formulas in which the ordinates are introduced in other orders, by choosing "difference paths" made up of suitable contiguous diagonal segments as was done in §2.5. Each such expansion is identical with the one which would be obtained more simply by the use of the inverted-difference array corresponding to an appropriate reordering of the abscissas, but only one array of reciprocal differences is needed for the formation of the entire set. Thus the use of reciprocal differences, rather than inverse differences, generally is advantageous only if several such formulas are required.
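The symmetry of the reciprocal differences, and the passage from them to the coefficients $a_k$, can be checked on a small case. The following Python sketch is illustrative only: the recursive function `rho`, the convention that an empty argument list gives zero, and the test function $f(x) = x^3$ at the points 0, 1, 2, 3 are assumptions chosen so that no denominator vanishes. It applies (9.12.5) in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations

def f(x):
    return Fraction(x) ** 3          # assumed sample function

def rho(xs):
    """Reciprocal difference rho_k[x_0, ..., x_k] via the recurrence (9.12.5)."""
    if len(xs) == 0:
        return Fraction(0)           # convention: rho_{-1} = 0
    if len(xs) == 1:
        return f(xs[0])              # rho_0[x_0] = f(x_0)
    head = xs[:-2]                   # x_0, ..., x_{k-2}
    numerator = xs[-1] - xs[-2]
    denominator = rho(head + (xs[-1],)) - rho(head + (xs[-2],))
    return numerator / denominator + rho(head)

points = (0, 1, 2, 3)

# The same value results for every ordering of the four arguments.
values = {rho(p) for p in permutations(points)}

# Thiele's rule: a_k = rho_k[x_0..x_k] - rho_{k-2}[x_0..x_{k-2}].
coefficients = [rho(points[:k + 1]) - rho(points[:max(k - 1, 0)])
                for k in range(len(points))]
print(values, coefficients)
```

The recursion recomputes lower-order differences many times, which is acceptable for a four-point check but would be replaced by a table in any serious calculation.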
However, the definition of the reciprocal difference is particularly useful in the important limiting case when the abscissas $x_0, x_1, x_2, \ldots$ all become coincident, so that the requirement that the deviation between $f(x)$ and the $k$th convergent of the fraction vanish at $k$ distinct points is replaced by the requirement that the deviation and its first $k - 1$ derivatives vanish at a single point $x_0$. Here (9.9.9) formally tends to the form

$$f(x) = \phi_0(x_0) + \cfrac{x - x_0}{\phi_1(x_0) + \cfrac{x - x_0}{\phi_2(x_0) + \cfrac{x - x_0}{\phi_3(x_0) + \cdots}}} \tag{9.12.6}$$

where

$$\phi_k(x) = \lim_{x_0, \ldots, x_k \to x} \phi_k[x_0, \ldots, x_k], \tag{9.12.7}$$

under the assumption that this limit exists for $k = 0, 1, \ldots$. If the fraction is terminated after $k$ divisions, it is necessary to replace $\phi_k(x_0)$ by $\phi_k(x_0) + (x - x_0)/\phi_{k+1}[x_0, \ldots, x_0, x]$ in order to restore the equality.

The consideration of this limit is complicated by the fact that the $k + 1$ arguments $x_0, \ldots, x_k$ are not symmetrically involved. Thus it is desirable to use (9.12.4) to express (9.12.7) in the form

$$\phi_k(x) = \lim_{x_0, \ldots, x_k \to x} \{\rho_k[x_0, \ldots, x_k] - \rho_{k-2}[x_0, \ldots, x_{k-2}]\}, \tag{9.12.8}$$

so that both terms on the right are symmetrical in their arguments. Accordingly, we have also

$$\phi_k(x) = \rho_k(x) - \rho_{k-2}(x), \tag{9.12.9}$$

with the additional abbreviation

$$\rho_k(x) = \lim_{x_0, \ldots, x_k \to x} \rho_k[x_0, \ldots, x_k]. \tag{9.12.10}$$

In addition, we have the relation

$$\phi_k(x) = \lim_{x_0, \ldots, x_k \to x} \frac{x_k - x_{k-1}}{\rho_{k-1}[x_0, \ldots, x_{k-2}, x_k] - \rho_{k-1}[x_0, \ldots, x_{k-2}, x_{k-1}]} \tag{9.12.11}$$

from (9.12.4) and (9.12.5), and, if the limit on the right exists, it clearly is the reciprocal of the quantity

$$\lim_{x_0, \ldots, x_{k-1} \to x} \frac{\partial \rho_{k-1}[x_0, \ldots, x_{k-1}]}{\partial x_{k-1}} = \frac{1}{k}\,\frac{d\rho_{k-1}(x)}{dx} \tag{9.12.12}$$

in consequence of the symmetry in the arguments, so that (9.12.11) becomes

$$\phi_k(x) = \frac{k}{\rho'_{k-1}(x)}. \tag{9.12.13}$$

Thus we may evaluate the coefficients $\phi_k(x_0)$ appearing in (9.12.6) successively, by using the formulas (9.12.9) and (9.12.13) in the forms

$$\rho_k(x) = \rho_{k-2}(x) + \phi_k(x), \qquad \phi_{k+1}(x) = \frac{k + 1}{\rho'_k(x)}, \tag{9.12.14}$$

with the obvious starting values

$$\rho_{-2}(x) = \rho_{-1}(x) = 0, \qquad \phi_0(x) = f(x), \tag{9.12.15}$$

and evaluating the functions $\phi_k(x)$ at $x = x_0$. The function $\rho_k(x)$ is often called the $k$th reciprocal derivative of $f(x)$. In accordance with the terminology of the preceding section, we may refer to $\phi_k(x)$ as the $k$th inverse derivative of $f(x)$.
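In order to illustrate the calculation in a simple case, we consider the function $f(x) = e^x$. By using successively the first and second relations in (9.12.14) with $k = 0, 1, 2, \ldots$, we obtain the functions

$$\begin{array}{ll}
 & \phi_0 = e^x, \\
\rho_0 = e^x, & \phi_1 = e^{-x}, \\
\rho_1 = e^{-x}, & \phi_2 = -2e^x, \\
\rho_2 = -e^x, & \phi_3 = -3e^{-x}, \\
\rho_3 = -2e^{-x}, & \phi_4 = 2e^x, \\
\rho_4 = e^x, & \phi_5 = 5e^{-x},
\end{array}$$

and so forth. If we take $x_0 = 0$ in (9.12.6), we thus obtain the expansion

$$e^x = 1 + \cfrac{x}{1 + \cfrac{x}{-2 + \cfrac{x}{-3 + \cfrac{x}{2 + \cfrac{x}{5 + \cdots}}}}} = 1 + \cfrac{x}{1 - \cfrac{x}{2 + \cfrac{x}{3 - \cfrac{x}{2 + \cfrac{x}{5 - \cdots}}}}}.$$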
Inspection suggests that the inverse derivatives of $e^x$, of even and odd orders, are given by

$$\phi_{2n}(x) = (-1)^n\,2e^x \quad (n \ge 1), \qquad \phi_{2n+1}(x) = (-1)^n(2n + 1)e^{-x},$$

and the truth of this conjecture is readily established by induction.

The expansion (9.12.6) is attributed to Thiele. It is related to the more general expansions considered previously as the Taylor-series expansion is related to the divided-difference polynomial interpolation formulas. Whereas the $n$th convergent of the "confluent" expansion (9.12.6) generally affords a better approximation to $f(x)$ in the immediate neighborhood of $x_0$, the corresponding convergent of a development which yields exact results at $n$ points of an interval including $x_0$ is usually to be preferred for approximation over that interval.

The expansion (9.12.6) can be generalized usefully as follows. If we replace the independent variable $x$ by $G(x)$, and write

$$f(G(x)) = F(x), \qquad \rho_k(G(x)) = P_k(x), \qquad \phi_k(G(x)) = \Phi_k(x),$$

the formulas of (9.12.14) become

$$P_k(x) = P_{k-2}(x) + \Phi_k(x), \qquad \Phi_{k+1}(x) = (k + 1)\frac{G'(x)}{P'_k(x)}, \tag{9.12.16}$$

with the starting values

$$P_{-2}(x) = P_{-1}(x) = 0, \qquad \Phi_0(x) = F(x), \tag{9.12.17}$$

and (9.12.6) takes the form [compare the Bürmann-series expansion (1.7.10)]

$$F(x) = A_0 + \cfrac{G(x) - G(x_0)}{A_1 + \cfrac{G(x) - G(x_0)}{A_2 + \cdots}} \tag{9.12.18}$$

where

$$A_k = \Phi_k(x_0). \tag{9.12.19}$$

Here, if the fraction is terminated with $A_n$, $A_n$ is to be replaced by

$$A_n + \frac{G(x) - G(x_0)}{\phi_{n+1}[G(x_0), \ldots, G(x_0), G(x)]}$$

if strict equality is to be preserved.

The result of truncating (9.12.18), and neglecting the residual, is an approximation to $F(x)$ in terms of a rational function of $G(x)$, which may be expected to be useful near $x_0$. It can be used, for example, to determine approximately the value of $F(x)$ when $G(x)$ takes on a prescribed value, if the corresponding value of $x$ is unknown but is approximated by $x_0$. The first few $\Phi$'s are readily found to be governed by the equations

$$\Phi_0 = F, \qquad \Phi_1 = \frac{G'}{F'}, \qquad \Phi_2 = \frac{2F'^2G'}{F'G'' - F''G'}, \qquad \ldots. \tag{9.12.20}$$

Thus, for example, if we take $F(x) = e^x$, $G(x) = \sin x$, $x_0 = 0$, we obtain the approximation

$$e^x = 1 + \cfrac{\sin x}{1 + \cfrac{\sin x}{-2 + \cdots}}$$

near $x = 0$.


In particular, if we take $F(x) = x$, we obtain a formula for inverse interpolation near $x = x_0$. For example, suppose that we require a zero $\bar{x}$ of $G(x)$, and that $x_0$ is a previously determined approximation to $\bar{x}$. With $A_0 = F(x_0) = x_0$, (9.12.18) then reduces to the form

$$\bar{x} = x_0 + \cfrac{-G(x_0)}{A_1 + \cfrac{-G(x_0)}{A_2 + \cdots}}$$

or to the more convenient equivalent form

$$\bar{x} = x_0 - \cfrac{\omega_1(x_0)}{1 - \cfrac{\omega_2(x_0)}{1 - \cfrac{\omega_3(x_0)}{1 - \cdots}}}, \tag{9.12.21}$$

where

$$\omega_1(x_0) = \frac{G(x_0)}{A_1}, \qquad \omega_k(x_0) = \frac{G(x_0)}{A_{k-1}A_k} \quad (k > 1), \tag{9.12.22}$$

and where (9.12.16), (9.12.17), and (9.12.19) are to be used with $F(x) = x$. Here the relations of (9.12.20) reduce to

$$\Phi_0 = x, \qquad \Phi_1 = G', \qquad \Phi_2 = \frac{2G'}{G''}, \qquad \Phi_3 = \frac{3G'}{1 + 2(G'/G'')'}, \qquad \ldots, \tag{9.12.23}$$

and there follows

$$\omega_1 = \frac{G}{G'}, \qquad \omega_2 = \frac{GG''}{2G'^2}, \qquad \omega_3 = \frac{G}{G'}\left(\frac{G''}{2G'} - \frac{G'''}{3G''}\right), \qquad \ldots. \tag{9.12.24}$$

The first four convergents of (9.12.21) are thus found to be

$$\bar{x}^{(1)} = x_0, \qquad \bar{x}^{(2)} = x_0 - \omega_1, \qquad \bar{x}^{(3)} = x_0 - \frac{\omega_1}{1 - \omega_2}, \qquad \bar{x}^{(4)} = x_0 - \frac{\omega_1(1 - \omega_3)}{1 - \omega_2 - \omega_3}, \tag{9.12.25}$$

where the $\omega$'s are to be evaluated at $x_0$.†

In illustration, the equation $x^3 - x - 1 = 0$ is easily seen to possess one real root $\bar{x}$, which lies between $x = 1$ and $x = 2$. If we choose the crude first approximation $x_0 = 1$, and set $G(x) = x^3 - x - 1$, there follows $G_0 = -1$, $G'_0 = 2$, $G''_0 = 6$, $G'''_0 = 6$, and the successive convergents are found to be 1, $\frac{3}{2} = 1.5$, $\frac{9}{7} \approx 1.29$, and $\frac{75}{56} \approx 1.34$. If the process is iterated, starting now with $x_0 = 1.34$, the successive approximants round to 1.34, 1.3249, 1.324720, and 1.324718. The result yielded by the last approximation is in fact correct to more than the seven digits given.
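The four convergents of (9.12.25) are easily programmed. The following Python sketch is an illustration under stated assumptions: the function name `frame_convergents` and the practice of passing the first three derivatives of $G$ as separate callables are choices made here, and the $\omega$'s are taken as $\omega_1 = G/G'$, $\omega_2 = GG''/2G'^2$, $\omega_3 = (G/G')(G''/2G' - G'''/3G'')$. The first call uses exact rational arithmetic from $x_0 = 1$; the second repeats the process in floating point from $x_0 = 1.34$.

```python
from fractions import Fraction

def frame_convergents(x0, G, dG, d2G, d3G):
    """Return the first four convergents of (9.12.21) evaluated at x0."""
    g, g1, g2, g3 = G(x0), dG(x0), d2G(x0), d3G(x0)
    w1 = g / g1
    w2 = g * g2 / (2 * g1 * g1)
    w3 = (g / g1) * (g2 / (2 * g1) - g3 / (3 * g2))
    return [
        x0,                                   # starting value
        x0 - w1,                              # Newton-Raphson step
        x0 - w1 / (1 - w2),                   # Halley step
        x0 - w1 * (1 - w3) / (1 - w2 - w3),   # next member of the sequence
    ]

def G(x):   return x**3 - x - 1
def dG(x):  return 3 * x**2 - 1
def d2G(x): return 6 * x
def d3G(x): return 6

exact = frame_convergents(Fraction(1), G, dG, d2G, d3G)
second = frame_convergents(1.34, G, dG, d2G, d3G)
print(exact)
print(second)
```

The sketch assumes that $G'$ and $G''$ do not vanish at $x_0$; no safeguards are included for those cases.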

Whereas expressions can be derived for the error of truncation (see Frame [87]), they are too complicated to be generally useful, and one must attempt to estimate the error in a given approximant by inspecting the behavior of the sequence of preceding approximants.

9.13. Supplementary References. For more elaborate techniques of discrete harmonic analysis, see Whittaker and Robinson [20], Willers [21], and Danielson and Lanczos [68]. Whittaker and Robinson also include an application of Prony's exponential approximation to the numerical solution of certain integral equations, as well as a treatment of
methods for determining periodicities, with collateral references. Ap 
cations of Chebyshev approximation include those of Lanczos [129 27. 
Miller [150], Gavurin [92], Grossman [98], and Sadler (201). 
aspects of “optimum” approximation, interpolation, and integration @ 
treated by Bernstein [35], Sard [207], Meyers and Sard [147, 148], Ha 
gon [103], Grosch [97], and Hastings [106, 107]. See also Olds [17 
For reference texts on continued fractions, see Perron [181] and Wall [231]. The reciprocal-difference methods of Thiele [19] are treated by Nörlund [13] and Milne-Thomson [11], who also present expressions for the error relevant to truncation of a continued fraction of Thiele type. For the use of continued fractions in the solution of equations, see Frame [87].

† The approximation $\bar{x}^{(2)}$ is that given by the classical Newton-Raphson procedure (see §10.8), the approximation $\bar{x}^{(3)}$ is attributed to Halley, and the investigation of the entire sequence appears to have been initiated by Frame.

PROBLEMS

Section 9.2

1. If $f(x) = \sin x$ when $\sin x \ge 0$ and $f(x) = 0$ when $\sin x < 0$, obtain the expansion

$$f(x) = \frac{1}{\pi} + \frac{1}{2}\sin x - \frac{2}{\pi}\left(\frac{\cos 2x}{1 \cdot 3} + \frac{\cos 4x}{3 \cdot 5} + \frac{\cos 6x}{5 \cdot 7} + \cdots\right).$$

Also compare graphically each of the three least-squares approximations corresponding to retention of harmonics through the second, fourth, and sixth with the true function.

2. Obtain the expansion

$$|x| = \frac{\pi}{2} - \frac{4}{\pi}\left(\frac{\cos x}{1^2} + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots\right) \qquad (-\pi \le x \le \pi).$$

By considering the right-hand member of this expansion, show that the series represents a triangular-wave function of period $2\pi$ which coincides with $f(x) = |x|$ when $|x| \le \pi$, and sketch the function. Also compare graphically each of the first three least-squares approximations, corresponding to retention of harmonics through the first, third, and fifth, with the true function.

3. Obtain the expansion

$$x(\pi - x) = \frac{8}{\pi}\left(\frac{\sin x}{1^3} + \frac{\sin 3x}{3^3} + \frac{\sin 5x}{5^3} + \cdots\right) \qquad (0 \le x \le \pi),$$

and sketch the periodic function represented by the expansion. Also compare graphically each of the first three distinct least-squares approximations with the true function.

4. Show that the square-wave function $f(x)$, which is of period $2\pi$ and which is such that $f(x) = -1$ when $-\pi < x < 0$ and $f(x) = +1$ when $0 < x < \pi$, possesses the expansion

$$f(x) = \frac{4}{\pi}\left(\frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right),$$

and verify that the expansion reduces to the average of the right- and left-hand limits of the function at its points of discontinuity. Also compare graphically the first three distinct least-squares approximations with the true function over $(-\pi, \pi)$.


5. Show that the mean inte q’ red error 0 wit. 
grated squa. ~d i 
| ) ( ) : : ass ciated ith the approximation 


x [rac — [a +4 y @ + 0p |, 


k=1 


he ; fa 
whereas the corresponding quantities associated with (9.2.8) and (9.2.10) ove (0,7) 
2. 2. r (O,r 


are given respectively by 
Lf gg ; 
a pide — (ap +4 > at) 
k=1 


1 v 1 . 
k=1 


Also use these results to calcul 
, ate the RMS e i 
mations considered in Probs. 1 to 4. Len rarer eoleentiop omg aba 


6. Suppose that $y(x)$ is to be of period $2\pi$ and is to satisfy the differential equation

$$\alpha y^{\mathrm{iv}}(x) + \beta y''(x) + \gamma y(x) = f(x),$$

where $\alpha$, $\beta$, and $\gamma$ are constants and $f(x)$ is a specified function of period $2\pi$. Show that, if an approximation to $y(x)$ is assumed in the form

$$y(x) \approx a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$$

and is introduced into the differential equation, and if the coefficients are determined in such a way that the period integral of the square of the difference between the two sides of the resultant equation is as small as possible, then there follows

$$\gamma a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx, \qquad (\alpha k^4 - \beta k^2 + \gamma)a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos kx\,dx,$$

$$(\alpha k^4 - \beta k^2 + \gamma)b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin kx\,dx \qquad (k \ge 1).$$

Also use this result to write down a series expansion of the solution of the equation $y''(x) + \lambda y(x) = f(x)$ which is of period $2\pi$, when $f(x)$ is the square-wave function defined in Prob. 4 and $\lambda$ is a constant such that $\lambda \ne 1^2, 3^2, \ldots, (2k + 1)^2, \ldots$.


Section 9.3 
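7. By noticing that the relevant series is geometric, show that

$$\sum_{r=-N+1}^{N} e^{ir\alpha} = \begin{cases} e^{i\alpha/2}\,\dfrac{\sin N\alpha}{\sin (\alpha/2)} & (\alpha \ne 2\nu\pi), \\[1ex] 2N & (\alpha = 2\nu\pi), \end{cases}$$

where $\nu$ is any integer, and hence that

$$\sum_{r=-N+1}^{N} \cos r\alpha = \begin{cases} \cot \dfrac{\alpha}{2}\,\sin N\alpha & (\alpha \ne 2\nu\pi), \\[1ex] 2N & (\alpha = 2\nu\pi), \end{cases}$$

and

$$\sum_{r=-N+1}^{N} \sin r\alpha = \sin N\alpha.$$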


8. By taking $\alpha = m\pi/N$ in the results of Prob. 7, where $m$ is an integer, and writing $x_r = r\pi/N$, show that

$$\sum_{r=-N+1}^{N} \cos mx_r = \begin{cases} 0 & (m \ne 2\nu N), \\ 2N & (m = 2\nu N), \end{cases}$$

and

$$\sum_{r=-N+1}^{N} \sin mx_r = 0.$$

Then, by using the identity $2 \cos jx_r \cos kx_r = \cos (j - k)x_r + \cos (j + k)x_r$ and similar identities, deduce the results of Eqs. (9.3.2) and (9.3.3).

9. Use the results of Probs. 7 and 8 to show that, if the range of summation is changed to $r = 0, 1, \ldots, N$ in (9.3.2) and (9.3.3), and if the weighting function $w_r$ is inserted in each summand, where

$$w_r = \begin{cases} \frac{1}{2} & (r = 0), \\ 1 & (r = 1, 2, \ldots, N - 1), \\ \frac{1}{2} & (r = N), \end{cases}$$

then the right-hand members of all formulas in those equations are to be divided by the factor 2.
10. Suppose that the ordinates $f_{-N+1}, \ldots, f_{N-1}, f_N$ are empirical, with $f_{-N} = f_N$, and are subject to independent normal error distributions with zero means and a common RMS value $\sigma$. Show that the corresponding RMS errors associated with the coefficients calculated from (9.3.6) are given by

$$(\delta A_0)_{\text{RMS}} = (\delta A_N)_{\text{RMS}} = \frac{\sigma}{\sqrt{2N}}, \qquad (\delta A_k)_{\text{RMS}} = (\delta B_k)_{\text{RMS}} = \frac{\sigma}{\sqrt{N}} \qquad (k = 1, \ldots, N - 1).$$
11. Suppose that the ordinates $F_0, F_1, \ldots, F_N$ and $G_1, G_2, \ldots, G_{N-1}$ are
empirical and are subject to independent normal error distributions with zero means
and a common RMS value $\sigma$. Use the result of Prob. 9 to show that the corresponding
RMS errors associated with the coefficients of the cosine approximation to F(x) over
$(0,\pi)$, calculated from (9.3.11), are given by

$$(\delta A_0)_{\rm RMS} = (\delta A_N)_{\rm RMS} = \frac{\sqrt{2N-1}}{\sqrt{2}\,N}\,\sigma, \qquad
(\delta A_k)_{\rm RMS} = \frac{\sqrt{2N-2}}{N}\,\sigma \qquad (k = 1, \ldots, N-1),$$

whereas those associated with the coefficients of the sine approximation to G(x) over
$(0,\pi)$ are given by

$$(\delta B_k)_{\rm RMS} = \sqrt{\frac{2}{N}}\,\sigma.$$


12. The following approximate values of a function f(x), known to be of period 2π,
are available:

   x   |  -π      -5π/6    -2π/3    -π/2     -π/3     -π/6
  f(x) |  2.077    0.278   -1.014   -0.716    0.051    0.277

   x   |  0        π/6      π/3      π/2      2π/3     5π/6     π
  f(x) |  1.015    3.031    4.759    4.680    3.689    3.032    2.077

Assuming first that the given values are correct to the number of places given,
determine a trigonometric function of period 2π which agrees with f(x), to those places,
at all tabular points. If it is known that the magnitude of the errors in all given values
may be as large as 0.005, and that all higher harmonics are negligible, determine how
many of the calculated harmonics can be neglected if the total error is nowhere to
exceed about 0.01. If, instead, it is known only that the approximate ordinates are
subject to error distributions with an RMS value of about 0.0025, and if all higher
harmonics are again assumed to be negligible, use the result of Prob. 10 to estimate
the RMS errors in the calculated coefficients.

13. Determine a seven-term cosine-series approximation of period 2π to the function
f(x) of Prob. 12 over (0,π), and analyze the results as in Prob. 12.


14. Using the following data, determine a five-term sine-series approximation of
period 2π to the function f(x) over (0,π), and analyze the results as in Prob. 12:

   x   |  0     π/6     π/3     π/2     2π/3    5π/6    π
  f(x) |  0     1.186   0.864   4.002   6.060   2.868   0
Section 9.4
15. Suppose that approximate data are available for a function known to be of the
form $F(t) = Ae^{bt}$, where A and b are unknown constants. Show that the change in
notation

log F(t) = f(t),   log A = c

leads to the linear relation f(t) = c + bt, after which the [...] methods [...]
determination of c and b, and hence of A and b. Apply [...] in the case when the
following empirical values of F(t) are given, and determine whether the result is
consistent with the hypothetical fact that the errors in the given data do not exceed
0.0002 in magnitude:

   t   |  9        12       15       18       21       24       27
  F(t) |  0.5820   0.4622   0.3672   0.2920   0.2320   0.1843   0.1463


16. Increase each of the given ordinates in Prob. 15 by [...], and suppose that the
resultant ordinates correspond to a function G(t). Show first that [...] does not lead
to an approximation consistent with [...]. Then, assuming knowledge that the true
function is of the form $A_0 + A_1 e^{bt}$, but without making use of any other
information, use Prony's method [with z = (t − 9)/3] to approximate $A_0$, $A_1$, and b.

17. Using the modified data of Prob. 16, assume an approximation of the form
$G(t) \approx A_0 + A_1 e^{b_1 t} + A_2 e^{b_2 t}$, and use Prony's method to [...] that a
negative value of $e^{[...]}$ is obtained, so that the third term is [...] at successive
tabular points, and hence presumably is to be interpreted as "noise" in this case.
(Take care to retain sufficiently many digits.)


Section 9.5
18. Repeat the calculations of the illustrative example, using given values of [...]
rounded correctly to five decimal places.

19. The following data represent observed values of a certain physical quantity:

   t   |  0      0.05    0.10    0.15    0.20
  F(t) |  0      0.954   1.527   1.502   0.918

   t   |  0.25    0.30     0.35     0.40     0.45     0.50
  F(t) |  0.030  -0.752   -1.090   -0.833   -0.091    0.814

The errors in measurement are known not to exceed 0.0005. Theory predicts that the
true function F(t) should satisfy the differential equation [...] + 1.40 and k = 248,
and hence should be of period [...] = 0.472. [...] believe that the difference G(t)
between the true [...] subject to measurement satisfies an equation of the form
G''(t) [...] = [...] constant c. Investigate the plausibility of this conjecture, and
approximate the [...] of the perturbation G(t).


Section 9.6 


20. Plot the function $\pi(x) = x(x^2 - 1/4)(x^2 - 1)$ in (−1,1), relevant to a
[...] employing data at equally spaced points, together with the corresponding
functions associated with data prescribed at the zeros of $P_5(x)$ and $T_5(x)$, on
a common graph. Also determine the maximum and RMS values of these functions
over (−1,1).

21. Use the Lagrange interpolation formula to determine three parabolic approximations
to $f(x) = e^x$ over (−1,1), such that $y_1(x)$ agrees with f(x) at x = −1, 0, and
1, $y_2(x)$ agrees with f(x) at the zeros of $P_3(x)$, and $y_3(x)$ agrees with f(x) at the
zeros of $T_3(x)$. Show also that the errors can be expressed in the forms
$x(x^2 - 1)e^{\xi_1}/6$, $x(x^2 - 3/5)e^{\xi_2}/6$, and $x(x^2 - 3/4)e^{\xi_3}/6$, respectively,
where each $\xi$ is in (−1,1). Calculate the actual errors in the three approximations for
x = −1.0(0.2)1.0, plot them on a common graph, and compare them with respect to
approximate maximum and RMS values.


Section 9.7 
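22. Derive (9.7.8) by first obtaining the intermediate results

$$\sum_{r=0}^{n} e^{i(2r+1)\alpha} =
\begin{cases}
e^{i(n+1)\alpha}\,\dfrac{\sin (n+1)\alpha}{\sin\alpha} & (\alpha \ne \nu\pi),\\[4pt]
(-1)^{\nu}(n+1) & (\alpha = \nu\pi),
\end{cases}$$

$$\sum_{r=0}^{n} \cos m\theta_r =
\begin{cases}
0 & [m \ne 2\nu(n+1)],\\
(-1)^{\nu}(n+1) & [m = 2\nu(n+1)],
\end{cases}$$

where $\alpha = m\pi/[2(n+1)]$ and $\theta_r = (2r+1)\pi/(2n+2)$, and then using the identity
$\cos j\theta_r \cos k\theta_r = \tfrac12 \cos (j-k)\theta_r + \tfrac12 \cos (j+k)\theta_r$.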
23. Determine, to four decimal places, the coefficients in the approximation

$$e^x \approx \sum_{k=0}^{5} c_k T_k(x) \qquad (|x| \le 1),$$

if the approximation is to be exact at the zeros of $T_6(x)$, and show that the magnitude
of the error is smaller than $e/23040 \approx 0.00012$ everywhere in (−1,1). Also, recalling
that $|T_k(x)| \le 1$ in (−1,1), obtain upper bounds on the errors relevant to the (weighted)
least-squares approximations of degrees two, three, and four, obtained by truncation,
and use Eqs. (7.8.10) to express these approximations in explicit polynomial form.


Section 9.8 


24. Determine two third-degree approximations to $e^x$ over (−1,1), in addition to
(9.8.17), by truncating the Maclaurin expansion of $e^x$ instead with the $x^4$ term and
with the $x^6$ term, and proceeding by the Lanczos method. Also compare the error
bounds associated with these approximations with each other and with the
corresponding approximation obtained in Prob. 23.


25. Obtain an approximation of the form

$$\cos x \approx A_0 + A_2 x^2 + A_4 x^4$$

with an error smaller than $5 \times 10^{-5}$ over (−1,1).


26. Show that the polynomials $T_r(2x - 1)$ play the same role over (0,1) as do the
polynomials $T_r(x)$ over (−1,1) and, with the abbreviation

$$T_r^* \equiv T_r^*(x) = T_r(2x - 1),$$

obtain the relations

$$\begin{aligned}
T_0^* &= 1, \qquad T_1^* = 2x - 1, \qquad T_2^* = 8x^2 - 8x + 1, \qquad T_3^* = 32x^3 - 48x^2 + 18x - 1,\\
T_4^* &= 128x^4 - 256x^3 + 160x^2 - 32x + 1,\\
T_5^* &= 512x^5 - 1280x^4 + 1120x^3 - 400x^2 + 50x - 1
\end{aligned}$$

and

$$\begin{aligned}
1 &= T_0^*, \qquad 2x = T_0^* + T_1^*, \qquad 8x^2 = 3T_0^* + 4T_1^* + T_2^*,\\
32x^3 &= 10T_0^* + 15T_1^* + 6T_2^* + T_3^*,\\
128x^4 &= 35T_0^* + 56T_1^* + 28T_2^* + 8T_3^* + T_4^*,\\
512x^5 &= 126T_0^* + 210T_1^* + 120T_2^* + 45T_3^* + 10T_4^* + T_5^*.
\end{aligned}$$


27. Use the notation and results of Prob. 26 to obtain a third-degree polynomial
approximation to $e^{-x}$ with an error smaller than 0.001 in magnitude over (0,1).

28. After expressing a five-term (eighth-degree) truncation of the series
representation

$$\frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \cdots$$

in terms of the polynomials $T_k^*(x^2)$ $(0 \le k \le 4)$, by use of the results of Prob. 26,
obtain a polynomial approximation to sin x, involving as few terms as possible, with an
error smaller than $10^{-5}$ over (0,1).

29. The modified Bessel function $K_0(x)$ possesses the asymptotic expansion

$$\sqrt{\frac{2x}{\pi}}\,e^{x}K_0(x) \sim 1 - \frac{1^2}{1!\,(8x)} + \frac{(1\cdot 3)^2}{2!\,(8x)^2}
- \frac{(1\cdot 3\cdot 5)^2}{3!\,(8x)^3} + \cdots,$$

in which the error of truncation in the right-hand member is smaller in magnitude than
the first neglected term. Show that truncation with the $x^{-5}$ term corresponds to an
error smaller than 0.0008 for all $x \ge 3$. Then, after expressing the result of that
truncation in terms of the functions $T_k^*(3/x)$ $(0 \le k \le 5)$, obtain an approximation
of the same type over $(3, \infty)$ involving as few terms as possible, with an error not
exceeding [...] over that range.

Section 9.9 


30. Show that an inverted difference of the sum u(x) + v(x) is generally not equal
to the sum of the individual inverted differences, that multiplication of f(x) by a
constant c corresponds to multiplication of the nth inverted difference of f(x) by c if
n is even and by 1/c if n is odd, and that the addition of a constant to f(x) does not
affect its inverted differences.

A aa ar successive inverted differences di[l,2], oa{1,2,2], ¢a{1,2,3,2], . 
for the functions 22, «~?, and z — 2}. Then deduce the identity 








-1 
e=lt+7 se ges ? 
=4+ i] 
3 4d 2 
7 


together with corresponding identities in the other two cases, and verify their 


rectness. ’ : 
32. Form an inverted-difference array and use it to determine a function 


by a finite continued fraction which takes on the following values. Also express 
result as a simple fraction, 
z/0 12 3 4 6 
@ir ttt th F 


33. Replace the given ordinates in Prob. 32 by their three-place rounded values and 
repeat the determination, retaining an appropriate number of digits in the inter- 
mediate calculations and obtaining a function defined as a simple fraction whose 


values at the six given points round to the three-place values used. Compare the 
result with that of Prob. 32. 


34. Proceed as in Prob. 32 with the following data: 


z|0 1 








f@|0 -3 © $3 x 
Section 9.10 

35. Determine a rational function of the form (9.10.1), with n < 5, which takes on
the values

   x   |  1   2   3   4   5
  f(x) |  1   2   3   3   2

or prove that no such function exists.
36. Show that the substitution sequence

$$\begin{aligned}
f(x) &= w_0(x),\\
w_{2k}(x) &= w_{2k}(x_{2k}) + (x - x_{2k})\,w_{2k+1}(x),\\
w_{2k+1}(x) &= w_{2k+1}(x_{2k+1}) + \frac{x - x_{2k+1}}{w_{2k+2}(x)}
\end{aligned}$$

generates the representation

$$f(x) = f(x_0) + b_1(x - x_0) + \cfrac{(x - x_0)(x - x_1)}{a_2 + b_3(x - x_2) + \cfrac{(x - x_2)(x - x_3)}{a_4 + b_5(x - x_4) + \cdots}},$$

where $a_{2k} = w_{2k}(x_{2k})$ and $b_{2k+1} = w_{2k+1}(x_{2k+1})$, and that the a's and b's can be
determined as leading elements of columns comprising alternately first divided
differences and first inverted differences relative to corresponding elements and leading
elements of the preceding column. Investigate the form of the nth convergent, and
illustrate the procedure by use of ordinates of


_ 6 — bz + 322 
NO a he ha 


at x = 0, 1, 2, 3, 4, and 5. Determine what representation would result if the
definitions of $w_{2k}(x)$ and $w_{2k+1}(x)$ were interchanged. Determine what substitution
sequence would generate the representation


fla) = (we) + bile — a0) 


of (x — 20)(4 — 21) 
a, + bs(@ — a2) + ale — as)(@ — ms) + Ove —m)@—-aJ@-ar-- 


Section 9.11 


37. Use the recurrence relations (9.11.5) and (9.11.6) to obtain the successive
convergents $r_k = M_k/N_k$ relevant to the illustrative example [f(x) = cot x, $x_0 = 0.1$,
$x_1 = 0.2$, $x_2 = 0.3$, $x_3 = 0.4$] as follows:

$$9.967, \qquad \frac{-0.29799 + x}{-0.019865}, \qquad \frac{1.03656 - 0.20100x}{0.00199 + x}, \qquad
\frac{-2.70932 - 0.05529x + x^2}{0.00059 - 2.7199x}.$$

Also verify that the kth convergent agrees appropriately with cot x at the k
appropriate points, evaluate the successive convergents at x = 0.05, 0.15, 0.25, 0.35,
0.45, and compare the predicted values with the rounded true values.

38. By eliminating $a_k$ between (9.11.5) and (9.11.6), show that

$$\frac{M_{k+1}}{N_{k+1}} - \frac{M_k}{N_k} = -(x - x_{k-1})\,\frac{N_{k-1}}{N_{k+1}}
\left(\frac{M_k}{N_k} - \frac{M_{k-1}}{N_{k-1}}\right),$$

and deduce the relation

$$\frac{M_{k+1}}{N_{k+1}} - \frac{M_k}{N_k} = (-1)^{k+1}\,\frac{(x - x_0)(x - x_1)\cdots(x - x_{k-1})}{N_k N_{k+1}}.$$

Thus, with the notation $r_k = M_k/N_k$, show that the (k + 1)th convergent of the
continued fraction (9.9.9) can be written in the form

$$r_{k+1}(x) = a_0 + \sum_{n=1}^{k} (-1)^{n+1}\,\frac{(x - x_0)(x - x_1)\cdots(x - x_{n-1})}{N_n(x)\,N_{n+1}(x)}.$$


39. Use the result of Prob. 38 to show that the kth convergent obtained in Prob. 37
can be obtained also by terminating the expansion

$$\cot x \approx 9.967 + \frac{x - 0.1}{(-0.019865)} - \frac{(x - 0.1)(x - 0.2)}{(-0.019865)(x + 0.00199)}
+ \frac{(x - 0.1)(x - 0.2)(x - 0.3)}{(x + 0.00199)(0.00059 - 2.7199x)} - \cdots$$

with the kth term.
40. Use results of Prob. 38 to show that the error expression (9.11.8) can be written
in the form

$$f(x) - r_k(x) = (-1)^{k+1}\,\frac{\pi_k(x)/N_k(x)}{(x - x_{k-1})\,N_{k-1}(x) + N_k(x)\,\phi_k[x_0, x_1, \ldots, x_{k-1}, x]},$$

where $\pi_k(x) = (x - x_0)(x - x_1)\cdots(x - x_{k-1})$.
41. Assuming knowledge of the fact that cot x becomes infinite at x = 0, verify that
the introduction of x = 0 as a fifth abscissa in the text example leads to the
approximate information

$$\phi_3[0.1, 0.2, 0.3, 0] \approx -3.00,$$

so that $\phi_3[0.1, 0.2, 0.3, x]$ varies from about −3.00 when x = 0 to about −2.70 when
x = 0.4. Under the assumption that that function increases steadily over that
interval, use the result of Prob. 40 to show that the error in the calculation of cot x
from the fourth convergent would be less than about 0.0006 if no round-off errors were
involved.
42. Repeat the calculations of the text example, using the following improved
(four-place) approximate ordinates:

    x    |  0.1      0.2      0.3      0.4
  cot x  |  9.9666   4.9332   3.2327   2.3652
43. Deal as in Probs. 37, 39, and 41 with the results of Prob. 42.

44. Determine values of the first four approximate convergents to $K_1(0.3) \approx 3.056$,
where $K_1(x)$ is a modified Bessel function, introducing successively the following
data:

    x     |  0     0.2     0.4     0.6     0.8
  K_1(x)  |  ∞     4.776   2.184   1.303   0.862

45. Deal as in Probs. 37, 39, and 41 with the results of Prob. 44, obtaining as much
evidence as possible with regard to the accuracy afforded by the fourth convergent
over the interval (0,1). Also verify the conclusions [...] values with the following
additional rounded [...]:

    x     |  0.1     0.3     0.5     0.7     0.9      1.0
  K_1(x)  |  9.854   3.056   1.656   1.050   0.7165   0.5098


Section 9.12 


46. : ; 2 
Construct the following reciprocal-difference table from the given ordinates: 





x 
Bg eA Siena 
‘ —2 
1} =z 5 
10 
a -2 
Qi 4 1 
oy 0 
=e 14 
3] 2 1 
z ou —Ts 0 
a Si 66 
4| a7 a 
Par re 
5| as 


PR ; . 
rom this table, rederive (9.9.15) and also obtain the representation 











ro(t) = 2 + i z- 2 Q4+2 
0 = = , 
rr a 71 ol 1+ 2? 

cee han ee 

20132 ae —4 

3 1 

4 


corresponding to a zigzag difference 
path launch = 
47. Obtain the formal expansion ies aa 


= t & & 
P +2434+1454+.---. 
so show that ¢en(x) = 2/n (n= 1 
(2n)th coefficient is 2/n and the (2n oe aid 1. iwc 


48. Use (9.11.5) and (9.11.6) to sh i 
ota es oma show that the leading convergents of the expansion 


2x 6x + a? 6 2 
0, &, | , ot tal, 6x + 32% _  30r + 2iz* +2 60x + 602! 4 
+a 6 + 42’ 6 + 62 + al 50 + Ba Oa" GO + 00s FBG aa 


log (1 + a) =i 
and determine the successive approximations to log z and log 2. Also use the result 
of Prob. 38 to show that these convergents are the partial sums of the formal expansion 








= x? x8 = x! 
log 1+2)=O0+2-a T+ @pnO@+e) OF 4x)6 + br +24) 
Phd a8 
+ (6 + 6x + #?)(30 + 36x + 92?) = (80 + 362 + 92)(20 + 30x + 12x? + 23) 


+ 


49. Obtain the formal representation 





hdc ie a Mees za-l «2-1 2-1 
4 2+ 1+ -6 + -~$ +°-°> 


and express the first five convergents as simple fractions. Also use the result of Prob. 
38 to show that the representation can be expressed in the form 


e-1_ (-1)?_ (a — 1)8 
2 2i+2) 4 +2)(2 +2) 


(« — 1)* 


+ 4@+a)0 +2429 
+: a 


tania = 





+ 


IA 


60. Obtain the formal representations 


@+Mact eS te4 


Sag en ees A, 
and @+ejyt=- Qi + 2+18i+ 2+---, 
3c 15¢ 


where c is a positive constant. 
51. With the notation 


1 
Ay = ki f(x), 


use (9.12.14) to show that the first five Thiele coefficients in (9.12.6) are given by 






= Ao, @ as ces Sie en. ee a= (4:45 — 3 
PR Oa ee Ahh AD Ax(ArAy — A) 


where az, = dx(xo). Deduce also that if a function possesses the formal Tayl 
expansion Ay + Aiu + Ayu? +--+ and the formal Thiele expansion do +3 


7 Wi wie ds then the leading a’s can be calculated from the leading A’s by use 
2 ? 
these relations. 

52. Show that the expansion (9.12.6) is nonexistent when f (2) = cos « and ao = 
but that, when F(z) = cos x, G(x) = 2, and 2 = 0, the leading terms of the exp 
sion (9.12.18) are of the form 

ag? x? a? 
cose l t (3 4+ 0 + Maen 


Also obtain this form from the result of Prob. 51, with u = x, 
53. Obtain the leading terms of an expansion of sin x analogous to that obtained in
Prob. 52, by taking F(x) = (sin x)/x, $G(x) = x^2$, and $x_0 = 0$, and also by use of the
result of Prob. 51.

54. The equation $x - e^{-x} = 0$ possesses a real root between x = 0.5 and 0.6.
By use of the fact that $e^{-0.6} = 0.548812$, determine that root to five decimal
places.

55. Determine the root of the equation $x^4 - 3x + 1 = 0$ between x = 1.3 and 1.4
to five decimal places.




















CHAPTER 10 


NUMERICAL SOLUTION OF EQUATIONS 


















10.1. Introduction. This chapter summarizes a number of methods
which are available for the numerical solution of sets of linear algebraic
equations (§§10.2 to 10.7), nonlinear algebraic or transcendental equations
in general (§10.8), and nonlinear algebraic equations in particular
(§§10.9 to 10.18). With only minor exceptions, the treatments are
independent of the content of preceding chapters.

10.2. Sets of Linear Equations. A brief summary of terminologies
and of certain known results, relative to solutions of sets of linear
algebraic equations, is presented in this section.

We suppose first that we are concerned with a set of n equations relating
n unknowns $x_1, x_2, \ldots, x_n$, and expressed in the form

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= c_2,\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n,
\end{aligned} \tag{10.2.1}$$

where the $n^2$ coefficients $a_{ij}$ and the n right-hand members $c_i$ are
prescribed. Here $a_{ij}$ represents the coefficient of $x_j$ in the ith equation of the
set.
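The left-hand members may be specified by the square array of the
coefficients, known as the coefficient matrix,

$$\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}, \tag{10.2.2}$$

whereas the complete set may be specified by the rectangular array,

$$\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & c_1\\
a_{21} & a_{22} & \cdots & a_{2n} & c_2\\
\vdots & \vdots & & \vdots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn} & c_n
\end{bmatrix}, \tag{10.2.3}$$

known as the augmented matrix and formed by adjoining the column of
right-hand members to the n columns of (10.2.2).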
The minor of any element $a_{ij}$ in the coefficient matrix is defined as the
value of the determinant† of the square array obtained by deleting the
ith row and the jth column of the coefficient array. The cofactor of $a_{ij}$,
to be denoted here by $A_{ij}$, is defined as the result of changing the sign of
the minor i + j times, that is, as $(-1)^{i+j}$ times the minor. It is then true
that, if each element of any column of the coefficient matrix is multiplied by
its cofactor, then the sum of these n products is the value of the determinant
of that matrix. Furthermore, if each element of any column is multiplied by the
cofactor of the corresponding element of any other column, then the sum of these
n products is zero. (Both statements also remain true if the word column is
replaced by row throughout.)
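The two cofactor properties just stated can be checked numerically. The following is a minimal sketch in Python; the function names and the 3 × 3 integer matrix are illustrative assumptions, not taken from the text, and the determinant is evaluated by first-row cofactor expansion, which is practical only for small n.

```python
def minor_matrix(a, i, j):
    """Square array left after deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor_matrix(a, 0, j))
               for j in range(len(a)))


def cofactor(a, i, j):
    """Minor of a[i][j] with its sign changed i + j times."""
    return (-1) ** (i + j) * det(minor_matrix(a, i, j))


# Illustrative matrix (an assumption made for this sketch).
A = [[2, 1, -1],
     [-3, -1, 2],
     [-2, 1, 2]]
n = len(A)
D = det(A)

# Elements of column k times cofactors of column l, summed over the rows:
# the text asserts the value D when k == l and zero otherwise.
column_sums = [[sum(A[i][k] * cofactor(A, i, l) for i in range(n))
                for l in range(n)]
               for k in range(n)]

# The same check with rows in place of columns.
row_sums = [[sum(A[k][j] * cofactor(A, l, j) for j in range(n))
             for l in range(n)]
            for k in range(n)]
```

Both `column_sums` and `row_sums` should come out as D times the unit array.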

These facts permit the direct elimination of all unknowns except an
arbitrarily chosen one, say $x_k$, from (10.2.1). For if we multiply each
equation by the cofactor of the coefficient of $x_k$ in that equation and add
the results, they lead immediately to the consequence that the result is
of the form

$$D x_k = c_1 A_{1k} + c_2 A_{2k} + \cdots + c_n A_{nk}, \tag{10.2.4}$$

where D represents the value of the determinant of the coefficient matrix.
Thus, if D ≠ 0, and if (10.2.1) possesses a solution, then that solution is
unique, and each $x_k$ (k = 1, 2, ..., n) is obtained from (10.2.4) by
division by D.

It is then easily shown, by direct substitution, that the x's so obtained
actually do satisfy (10.2.1). For since (10.2.1) is of the form

$$\sum_{k=1}^{n} a_{ik}x_k = c_i \qquad (i = 1, 2, \ldots, n),$$

and since (10.2.4) is expressible in the form

$$D x_k = \sum_{j=1}^{n} A_{jk}c_j \qquad (k = 1, 2, \ldots, n),$$

the result of substituting (10.2.4) into the left-hand member of the ith
equation of (10.2.1) is

$$\frac{1}{D}\sum_{k=1}^{n} a_{ik}\sum_{j=1}^{n} A_{jk}c_j
= \frac{1}{D}\sum_{j=1}^{n}\Bigl(\sum_{k=1}^{n} a_{ik}A_{jk}\Bigr)c_j.$$

Since the inner sum in the second form is the sum of the products of the
elements in the ith row of the coefficient matrix and the cofactors of
corresponding elements in the jth row, it is equal to D when j = i and is zero
when j ≠ i, so that this quantity properly reduces to $c_i$.


† The definition and elementary properties of determinants are assumed.

Since the right-hand member of (10.2.4) would reduce to D if $c_1, c_2,
\ldots, c_n$ were replaced by $a_{1k}, a_{2k}, \ldots, a_{nk}$, it follows also that the
right-hand member of (10.2.4) is the value of the determinant of the matrix
obtained from the coefficient matrix by replacing the column of coefficients of
$x_k$ by the column of right-hand members of (10.2.1). If we denote the value
of this determinant by $D_k$, the solution of (10.2.1) can be written in the
simple form

$$x_k = \frac{D_k}{D} \qquad (k = 1, 2, \ldots, n), \tag{10.2.5}$$

if D ≠ 0. This result is known as Cramer's rule.
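Cramer's rule (10.2.5) can be written out directly for a small system. The sketch below assumes Python with exact rational arithmetic; the names `det` and `cramer` and the 3 × 3 example are illustrative assumptions rather than anything prescribed by the text.

```python
from fractions import Fraction


def det(a):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for j in range(len(a)):
        sub = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(sub)
    return total


def cramer(a, c):
    """Solve the set with coefficient matrix a and right-hand members c."""
    D = det(a)
    if D == 0:
        raise ValueError("D = 0: Cramer's rule does not apply")
    solution = []
    for k in range(len(a)):
        # D_k: the column of coefficients of x_k is replaced by the c column.
        a_k = [row[:k] + [c[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(a_k), D))
    return solution


# Illustrative system (an assumption made for this sketch):
#    2x1 +  x2 -  x3 =   8
#   -3x1 -  x2 + 2x3 = -11
#   -2x1 +  x2 + 2x3 =  -3
A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
c = [8, -11, -3]
x = cramer(A, c)
```

Here n + 1 = 4 determinants of order 3 are evaluated, which illustrates why the labor grows quickly with n.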
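It is convenient to write

$$\frac{A_{ij}}{D} = \tilde{A}_{ij} \tag{10.2.6}$$

when D ≠ 0, and to speak of this ratio as the reduced cofactor of $a_{ij}$ in the
matrix (10.2.2). With this notation, the result of writing out (10.2.4)
for k = 1, 2, ..., n is of the form

$$\begin{aligned}
x_1 &= \tilde{A}_{11}c_1 + \tilde{A}_{21}c_2 + \cdots + \tilde{A}_{n1}c_n,\\
x_2 &= \tilde{A}_{12}c_1 + \tilde{A}_{22}c_2 + \cdots + \tilde{A}_{n2}c_n,\\
&\;\;\vdots\\
x_n &= \tilde{A}_{1n}c_1 + \tilde{A}_{2n}c_2 + \cdots + \tilde{A}_{nn}c_n.
\end{aligned} \tag{10.2.7}$$

This set of relations is thus the result of "inverting" the relations (10.2.1),
when D ≠ 0.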
The array of coefficients of the right-hand members in (10.2.7),

$$\begin{bmatrix}
\tilde{A}_{11} & \tilde{A}_{21} & \cdots & \tilde{A}_{n1}\\
\tilde{A}_{12} & \tilde{A}_{22} & \cdots & \tilde{A}_{n2}\\
\vdots & \vdots & & \vdots\\
\tilde{A}_{1n} & \tilde{A}_{2n} & \cdots & \tilde{A}_{nn}
\end{bmatrix}, \tag{10.2.8}$$

may thus be called the inverse of the coefficient matrix of (10.2.1) in that
case, in the sense that whereas the array (10.2.2) specifies a transformation
of the x's into the c's, the array (10.2.8) specifies the inverse
transformation of the c's back into the x's.

We notice that the inverse of (10.2.2) can be obtained by first replacing
each element of (10.2.2) by its reduced cofactor (cofactor divided by D),
and then interchanging rows and columns.
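The two-step recipe above (replace each element by its reduced cofactor, then interchange rows and columns) can be sketched as follows. This is a minimal Python illustration under the assumption of exact rational arithmetic; the function names and the test matrix are hypothetical.

```python
from fractions import Fraction


def det(a):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))


def cofactor(a, i, j):
    """Signed minor of a[i][j]."""
    sub = [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]
    return (-1) ** (i + j) * det(sub)


def inverse_by_cofactors(a):
    n = len(a)
    D = det(a)
    if D == 0:
        raise ValueError("D = 0: no inverse exists")
    # Step 1: replace each element by its reduced cofactor (cofactor / D).
    reduced = [[Fraction(cofactor(a, i, j), D) for j in range(n)]
               for i in range(n)]
    # Step 2: interchange rows and columns.
    return [[reduced[j][i] for j in range(n)] for i in range(n)]


# Illustrative matrix and right-hand members (assumptions for this sketch).
A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
c = [8, -11, -3]
A_inv = inverse_by_cofactors(A)

# Applying the inverse array to the c's recovers the x's, as in (10.2.7).
x = [sum(A_inv[i][j] * c[j] for j in range(3)) for i in range(3)]

# The product of the matrix and its inverse should be the unit array.
product = [[sum(A[i][k] * A_inv[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
```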

In order to describe the situation in the case D = 0, as well as the case
in which there are m equations relating n unknowns, where m ≠ n, it is
desirable to define the rank of any rectangular matrix with, say, m rows
and n columns. From any such array, we may form a number of square
subarrays, by deleting certain rows and/or columns, and compressing the
remaining elements into a compact arrangement. The largest such subarrays
would be of an order equal to the smaller of the integers m and n;
the smallest would be of order one and would consist of only a single
element. The rank of the given matrix is defined as the order of the
largest such subarray whose determinant is not zero.

The basic theorem relevant to the existence of a solution of a set of m
equations in n unknowns states simply that the system is solvable if and
only if the rank of the coefficient matrix is equal to the rank of the augmented
matrix.

Suppose that the ranks are equal, and let the common rank be r. Then
r is not greater than the smaller of m and n. Now there exists at least
one square r × r subarray in the coefficient matrix whose determinant is
not zero. If one such subarray is found, and if r < m, then it can be
proved that the m − r equations whose coefficients are not involved in that
subarray are implied by the other r equations and can be suppressed.
If r = n, then n equations remain and can be solved uniquely by Cramer's
rule. However, if r < n, then the n − r unknowns in the r remaining
equations whose coefficients are not involved in the subarray can each be
assigned completely arbitrary values, after which the remaining r unknowns
can be determined in terms of them by Cramer's rule. Thus,
the general solution of the system then involves n − r arbitrary constants,
and the system is said to be of "defect" n − r, since it fails to
determine a unique solution by permitting n − r degrees of freedom.
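The solvability criterion and the notion of defect can be illustrated numerically. The sketch below, in Python, computes the rank by row elimination in exact arithmetic rather than by examining subdeterminants; the two definitions give the same number, but the substitution of technique is an assumption of this sketch, as are the function names and the example data.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a rectangular array, found by row elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a nonzero element in this column at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [v - factor * w for v, w in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def classify(a, c):
    """Return (solvable, defect); defect is None when no solution exists."""
    augmented = [row + [ci] for row, ci in zip(a, c)]
    r_coeff, r_aug = rank(a), rank(augmented)
    if r_coeff != r_aug:
        return False, None
    return True, len(a[0]) - r_coeff


# Illustrative data: the second equation is twice the first.
A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
consistent = classify(A, [6, 12, 2])    # ranks agree (both 2), defect 1
inconsistent = classify(A, [6, 13, 2])  # augmented rank 3 exceeds 2
```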

The cases of most frequent practical interest are those represented by
(10.2.1), in which m = n. In particular, if all the right-hand members
$c_i$ are zeros, the set is said to be homogeneous. In this case, the coefficient
matrix and augmented matrix are automatically of the same rank, so
that the set is solvable. In fact, one solution is then always the trivial
one, for which $x_1 = x_2 = \cdots = x_n = 0$. If D ≠ 0, this is the only
possible solution. Usually this solution is of no interest, and the important
homogeneous sets are those for which D = 0, so that the system
admits nontrivial solutions. On the other hand, if at least one of the
right-hand members is not zero, so that the trivial solution is not admissible,
the interest centers mainly on the cases when a nontrivial solution
exists and is unique, so that D ≠ 0. Attention will be restricted principally
to this last case in the following sections.

Before proceeding, it may be pointed out that, whereas the preceding
results are of basic theoretical importance and are essential to an
understanding of the nature of sets of linear equations, the use of Cramer's
rule in actual numerical cases is generally highly inefficient, because of the
excessive labor involved in the evaluation of n + 1 determinants of order
n, when n is fairly large. Many other methods have been devised, certain
of which are described in the following sections.
10.3. The Gauss Reduction. In principle, the simplest practical
method of solving the set (10.2.1) is one due to Gauss. It consists in
first dividing the first equation by $a_{11}$ and using the result to eliminate
$x_1$ from all succeeding equations. Next, the modified second equation
is divided by the coefficient of $x_2$ in that equation, and the result is used
to eliminate $x_2$ from the succeeding equations, and so forth. After this
elimination has been effected n times, when D ≠ 0, the resultant set,
which is equivalent to the original one except for the effects of any
round-offs committed, is of the form

$$\begin{aligned}
x_1 + a'_{12}x_2 + a'_{13}x_3 + \cdots + a'_{1n}x_n &= c'_1,\\
x_2 + a'_{23}x_3 + \cdots + a'_{2n}x_n &= c'_2,\\
&\;\;\vdots\\
x_{n-1} + a'_{n-1,n}x_n &= c'_{n-1},\\
x_n &= c'_n,
\end{aligned} \tag{10.3.1}$$

where $a'_{ij}$ and $c'_i$ designate specific numerical values, and the solution is
completed by working backward from the last equation, to obtain
successively $x_n, x_{n-1}, \ldots, x_1$. It is convenient to work with the augmented
arrays at each stage, rather than to write out each equation in full.

A renumbering of equations and/or variables will be necessary if, at any
stage, the coefficient of $x_k$ in the kth equation is zero, and it is desirable if
that coefficient is small relative to other coefficients in that equation, in
order that the effects of round-off errors may be minimized.
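The effect of a very small divisor can be seen in a two-equation example. The sketch below is a hypothetical Python illustration in ordinary double-precision arithmetic; the system and the rule used for reordering (taking the equation with the larger coefficient of $x_1$ first) are assumptions of the sketch, one common way of carrying out the renumbering.

```python
def solve_2x2(a, c, reorder):
    """Gauss reduction of two equations in two unknowns.

    When reorder is true, the equations are interchanged if the second has
    the larger coefficient of x1, so that the larger value is the divisor.
    """
    rows = [a[0] + [c[0]], a[1] + [c[1]]]
    if reorder and abs(rows[1][0]) > abs(rows[0][0]):
        rows[0], rows[1] = rows[1], rows[0]
    divisor = rows[0][0]
    first = [v / divisor for v in rows[0]]
    # Eliminate x1 from the second equation.
    second = [v - rows[1][0] * w for v, w in zip(rows[1], first)]
    x2 = second[2] / second[1]
    x1 = first[2] - first[1] * x2
    return x1, x2


# The coefficient of x1 in the first equation is tiny relative to the others.
a = [[1e-20, 1.0],
     [1.0, 1.0]]
c = [1.0, 2.0]

without_reordering = solve_2x2(a, c, reorder=False)
with_reordering = solve_2x2(a, c, reorder=True)
```

The true solution is very nearly $x_1 = x_2 = 1$. Without reordering, round-off destroys $x_1$ entirely; with reordering both values are recovered.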

The exceptional cases in which D = 0 would evidence themselves
through the fact that after r such eliminations, where r is the rank of the
coefficient matrix, all coefficients in the n − r succeeding equations would
vanish (except for the errors due to round-off). Unless all right-hand
members of those equations also were reduced to zeros at that stage, the
original set would be unsolvable. If all those members were zeros, the
n − r equations would have been reduced to the form 0 = 0, and hence
would be ignorable. The rth equation would express $x_r$ as the sum of a
specified constant and a certain linear combination of $x_{r+1}, \ldots, x_n$, and
the process of back substitution would finally express $x_1, x_2, \ldots, x_{r-1}$ in
similar forms.

In illustration, we consider the three equations

$$\begin{aligned}
9.3746x_1 + 3.0416x_2 - 2.4371x_3 &= 9.2333,\\
3.0416x_1 + 6.1832x_2 + 1.2163x_3 &= 8.2049,\\
-2.4371x_1 + 1.2163x_2 + 8.4429x_3 &= 3.9339.
\end{aligned} \tag{10.3.2}$$

The reduced equations, corresponding to (10.3.1), are obtained in the form

$$\begin{aligned}
x_1 + 0.32445x_2 - 0.25997x_3 &= 0.98493,\\
x_2 + 0.38624x_3 &= 1.00246,\\
x_3 &= 0.61448,
\end{aligned} \tag{10.3.3}$$

if five decimal places are retained, and the "back solution" yields the
values

$$x_1 = 0.89643, \qquad x_2 = 0.76512, \qquad x_3 = 0.61448. \tag{10.3.4}$$

A discussion of the reliability of these results is deferred to later sections.
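The reduction and back solution just illustrated can be sketched in a few lines of Python. The function names are assumptions of this sketch, and no renumbering of equations is attempted. The coefficients are those of the symmetric illustrative system, with −2.4371 and 1.2163 as the off-diagonal values consistent with the reduced equations; the arithmetic is carried in double precision rather than rounded to five places at each step.

```python
def gauss_reduce(a, c):
    """Reduce the augmented array to the form (10.3.1): unit diagonal, zeros below."""
    n = len(a)
    rows = [[float(v) for v in a[i]] + [float(c[i])] for i in range(n)]
    for k in range(n):
        divisor = rows[k][k]
        if divisor == 0.0:
            raise ZeroDivisionError("zero divisor: renumber equations or variables")
        # Divide the kth equation by the coefficient of x_k ...
        rows[k] = [v / divisor for v in rows[k]]
        # ... and use it to eliminate x_k from the succeeding equations.
        for i in range(k + 1, n):
            factor = rows[i][k]
            rows[i] = [v - factor * w for v, w in zip(rows[i], rows[k])]
    return rows


def back_solve(rows):
    """Work backward from the last reduced equation."""
    n = len(rows)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = rows[i][n] - sum(rows[i][k] * x[k] for k in range(i + 1, n))
    return x


A = [[9.3746, 3.0416, -2.4371],
     [3.0416, 6.1832, 1.2163],
     [-2.4371, 1.2163, 8.4429]]
c = [9.2333, 8.2049, 3.9339]

reduced = gauss_reduce(A, c)
x = back_solve(reduced)
```

The entries of `reduced` should agree with the five-place reduced equations to about four decimal places, and `x` with the quoted back solution to the same order; small differences in the last place reflect the rounding of intermediate values in the hand calculation.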

This method is known as the Gauss reduction. A modification, known 
as the Gauss-Jordan reduction, consists in using the kth equation, at the 
kth stage, to eliminate x, from the preceding equations as well as the 
following ones, so that the solution is obtained after n (or less) elimina- 
tions, and no back substitution is necessary. 
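The Gauss-Jordan modification can be sketched in the same style. In this hypothetical Python illustration (the function name is an assumption, and no renumbering of equations is attempted), the kth equation is used to eliminate $x_k$ from every other equation, so that the last column of the final array is the solution itself.

```python
def gauss_jordan(a, c):
    """Gauss-Jordan reduction: eliminate x_k from all other equations at stage k."""
    n = len(a)
    rows = [[float(v) for v in a[i]] + [float(c[i])] for i in range(n)]
    for k in range(n):
        divisor = rows[k][k]
        if divisor == 0.0:
            raise ZeroDivisionError("zero divisor: renumber equations or variables")
        rows[k] = [v / divisor for v in rows[k]]
        for i in range(n):
            if i != k:  # preceding equations as well as following ones
                factor = rows[i][k]
                rows[i] = [v - factor * w for v, w in zip(rows[i], rows[k])]
    # The coefficient array is now the unit array; no back substitution is needed.
    return [row[n] for row in rows]


# The symmetric three-equation illustration, with coefficients as assumed above.
A = [[9.3746, 3.0416, -2.4371],
     [3.0416, 6.1832, 1.2163],
     [-2.4371, 1.2163, 8.4429]]
c = [9.2333, 8.2049, 3.9339]
x = gauss_jordan(A, c)
```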

In practice, only the coefficients are recorded at the successive stages of 
the reduction, the array corresponding to the first stage intermediate 
between (10.3.2) and (10.3.3) thus being of the form 


1 0.32445 —0.25997 0.98493 
5.19635 2.00702 5.20914 
2.00702 7.80933 6.33427 


The necessity of recording new arrays at each of the intermediate stages 
is time-consuming and conducive to gross errors, particularly when many 
equations are involved. In the following section, a more efficient tech- 
nique is described. 

10.4. The Crout Reduction. A modification of the Gauss reduction 
which has the advantages that it is particularly well adapted to the use 
of desk calculators and of large-scale computers, and that the recording 
(or storage) of auxiliary data (such as the repeated rewriting of modified 
equations or arrays) is somewhat minimized, is due to Crout.†

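Starting with the augmented matrix M of the original system,

$$M = \left[\begin{array}{cccc|c}
\underline{a_{11}} & a_{12} & \cdots & a_{1n} & c_1\\
a_{21} & \underline{a_{22}} & \cdots & a_{2n} & c_2\\
\vdots & \vdots & & \vdots & \vdots\\
a_{n1} & a_{n2} & \cdots & \underline{a_{nn}} & c_n
\end{array}\right] \equiv [\,A \mid c\,], \tag{10.4.1}$$

which may be considered as being partitioned into the coefficient array
A and the c column, one determines next the elements of an auxiliary
matrix M' of the same dimensions,

$$M' = \left[\begin{array}{cccc|c}
\underline{a'_{11}} & a'_{12} & \cdots & a'_{1n} & c'_1\\
a'_{21} & \underline{a'_{22}} & \cdots & a'_{2n} & c'_2\\
\vdots & \vdots & & \vdots & \vdots\\
a'_{n1} & a'_{n2} & \cdots & \underline{a'_{nn}} & c'_n
\end{array}\right] \equiv [\,A' \mid c'\,], \tag{10.4.2}$$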

† See reference [64], in which modifications which are convenient when the
coefficients are complex are also given. Similar methods [...] Banachiewicz, Turing,
and Zurmühl [...].
which may be considered as being partitioned, in the same way, into a
square array A' and a c' column. From this matrix, one then obtains a
solution column x whose elements are the required values of $x_1, \ldots, x_n$,

$$x = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}. \tag{10.4.3}$$

Each entry in (10.4.2) and (10.4.3) is obtained from previously calculated
data by a continuous sequence of operations, which can be
effected without the tabulation of intermediate data.
In order to describe the reduction in a simple way, it is convenient to
introduce two definitions. First, the diagonal elements (or elements of
the principal diagonal) of a matrix are those elements whose row and
column indices are equal, and which are underlined in (10.4.1) and (10.4.2).
Second, the inner product of a row and a column, each containing n
elements, is defined as the sum of the n products of corresponding
elements, the elements of a row being ordered from left to right, and the
elements of a column from head to foot.

The n elements of the first column of the auxiliary matrix M' are
determined first, then the remaining n of the n + 1 elements of the first
row. Next, the remaining n − 1 elements of the second column and
the second row are determined, then the remaining n − 2 elements of
the third column and third row, and the process is continued until the
array is filled.

The elements of the first column of M' are identical with the
corresponding elements of M; the remaining elements of the first row
(to the right of the diagonal element $a'_{11}$) are each obtained by dividing
the corresponding element of M by the diagonal element $a'_{11}$. Thus, for
example, $a'_{11} = a_{11}$, $a'_{21} = a_{21}$, and $a'_{12} = a_{12}/a'_{11}$.

Beyond this stage the remaining elements of M' are calculated, in the order
specified above, according to two rules:

1. Each element on or below the principal diagonal in M' is obtained
by subtracting from the corresponding element in M the inner product of
its own column and its own row in the square subarray A', with all
uncalculated elements imagined to be zeros.

2. Each element to the right of the principal diagonal in M' is
calculated by the same procedure, followed by a division by the diagonal
element in its row of M'.

5 Finally, the elements of the solution column x are determined in 
? 


order 2n, %n—1) + + + » Xa, 1, from foot to head, The element 2» is i 
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tical with c,. Each succeeding element above it is obtained as the result 
of subtracting from the corresponding element of the c’ column the inner 
product of its row in A’ and the x column, with all uncalculated elements 
of the x column imagined to be zeros. 

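The preceding instructions are summarized by the equations

$$a'_{ij} = a_{ij} - \sum_{k=1}^{j-1} a'_{ik} a'_{kj} \qquad (i \ge j), \tag{10.4.4}$$

$$a'_{ij} = \frac{1}{a'_{ii}} \left[ a_{ij} - \sum_{k=1}^{i-1} a'_{ik} a'_{kj} \right] \qquad (i < j), \tag{10.4.5}$$

$$c'_i = \frac{1}{a'_{ii}} \left[ c_i - \sum_{k=1}^{i-1} a'_{ik} c'_k \right], \tag{10.4.6}$$

and

$$x_i = c'_i - \sum_{k=i+1}^{n} a'_{ik} x_k, \tag{10.4.7}$$

where $i$ and $j$ range from 1 to $n$ when not otherwise restricted.† It is
seen that the process defined by (10.4.7) is identical with the "back
solution" of the Gauss reduction, which determines $x_n, x_{n-1}, \ldots, x_1$ from
(10.3.1).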
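The four relations above translate almost line for line into a short program. The following is a minimal sketch, not the author's tabular layout: the function name `crout` and the list-of-lists storage are assumptions made for illustration, no reordering of equations is attempted, and the coefficients are those of the worked example of this section (with $a_{13} = a_{31} = -2.4371$).

```python
def crout(A, c):
    """Return the auxiliary matrix M' = [A' | c'] and the solution column x."""
    n = len(A)
    M = [list(A[i]) + [c[i]] for i in range(n)]   # given matrix M = [A | c]
    Mp = [[0.0] * (n + 1) for _ in range(n)]       # auxiliary matrix M'
    for s in range(n):
        # Column s, on and below the diagonal: relation (10.4.4).
        for i in range(s, n):
            Mp[i][s] = M[i][s] - sum(Mp[i][k] * Mp[k][s] for k in range(s))
        # Row s, right of the diagonal (c' included): (10.4.5) and (10.4.6).
        for j in range(s + 1, n + 1):
            inner = sum(Mp[s][k] * Mp[k][j] for k in range(s))
            Mp[s][j] = (M[s][j] - inner) / Mp[s][s]
    # Back solution (10.4.7), from foot to head.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = Mp[i][n] - sum(Mp[i][k] * x[k] for k in range(i + 1, n))
    return Mp, x


A = [[9.3746, 3.0416, -2.4371],
     [3.0416, 6.1832, 1.2163],
     [-2.4371, 1.2163, 8.4429]]
c = [9.2333, 8.2049, 3.9339]

Mp, x = crout(A, c)
det = Mp[0][0] * Mp[1][1] * Mp[2][2]   # product of the diagonal elements of A'
```

Because the program carries full double precision rather than five decimals, its entries agree with the hand tabulation only to about the last retained place.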

In the important cases when the coefficient array $A$ is symmetric, so
that each element $a_{ij}$ in $A$ above the principal diagonal is identical with
the symmetrically placed element $a_{ji}$ below the diagonal ($a_{ij} = a_{ji}$), as in
the system (10.3.2), it can be shown that each element $a'_{ij}$ in $A'$ above the
principal diagonal is given by the result of dividing the symmetrically
placed element $a'_{ji}$ below the diagonal by the diagonal element $a'_{ii}$. This
fact leads to a considerable reduction in labor in such cases, particularly
when $n$ is large, since then each element below the diagonal thus can be
recorded as the dividend involved in the calculation of the symmetrically
placed element, before the required division by the diagonal element is
effected.

It can be shown that the elements to the right of the diagonal in M’ 
are identical with the elements which appear in corresponding positions 
in the augmented matrix of (10.3.1), obtained by the Gauss reduction. 
The compactness of the tabulation is a consequence of the fact that all 
necessary intermediate data are tabulated in the remaining spaces, which 
would normally be occupied by 1's and zeros.

The $k$th diagonal element $a'_{kk}$ is the number by which the $k$th equation
would be divided, in the Gauss reduction, before that equation is used to
eliminate $x_k$ from succeeding equations. In consequence of this fact, it
is true that the value of the determinant of the original coefficient matrix is
the product of the diagonal elements of $A'$.

† The mathematical derivation of these relations is included in Appendix A.
A convenient check against calculational errors is afforded by adjoining
to the columns of $M$ an additional column, each of whose elements is
the sum of the elements in the corresponding row of $M$. If this column is
treated in the same way as the $c$ column, corresponding check columns are
obtained and adjoined to $M'$ and $x$. The check consists in the fact that
each element in the $M'$ check column should exceed by unity the sum of the
elements in its row of $M'$ which lie to the right of the diagonal element,
whereas each element in the $x$ check column should exceed by unity the
corresponding element in the $x$ column itself. The sudden appearance
of an appreciable discrepancy will generally indicate the commission of a
calculational error.

Minor discrepancies may correspond to the effects of intermediate
round-offs, effected in the steps of the reduction, and can be reduced by
retaining additional significant figures in that calculation, or by an
alternative procedure to be described at the end of this section. An
appreciable loss of accuracy may occur if a diagonal element which is
small, relative to elements to its right, appears at an early stage of the
reduction. Such a situation usually can be remedied by renumbering
the equations, and it frequently may be avoided by initially ordering the
equations in such a way that the coefficient of $x_k$ in the $k$th equation is
as large as possible relative to the other coefficients in that equation.†
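The check-column rule can be tried out mechanically. The sketch below is illustrative only: the helper names `crout_aux` and `back_solve` are hypothetical, the reduction accepts any number of columns to the right of the square array, and the data are those of the worked example of this section.

```python
def crout_aux(M, n):
    """Auxiliary matrix M' for an n-row array M with extra right-hand columns."""
    cols = len(M[0])
    Mp = [[0.0] * cols for _ in range(n)]
    for s in range(n):
        for i in range(s, n):                      # column s, on/below diagonal
            Mp[i][s] = M[i][s] - sum(Mp[i][k] * Mp[k][s] for k in range(s))
        for j in range(s + 1, cols):               # row s, right of diagonal
            inner = sum(Mp[s][k] * Mp[k][j] for k in range(s))
            Mp[s][j] = (M[s][j] - inner) / Mp[s][s]
    return Mp


def back_solve(Mp, n, col):
    """Back solution using column `col` of M' as the right-hand column."""
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = Mp[i][col] - sum(Mp[i][k] * x[k] for k in range(i + 1, n))
    return x


n = 3
M = [[9.3746, 3.0416, -2.4371, 9.2333],
     [3.0416, 6.1832, 1.2163, 8.2049],
     [-2.4371, 1.2163, 8.4429, 3.9339]]
M = [row + [sum(row)] for row in M]       # adjoin the check column of row sums

Mp = crout_aux(M, n)
x = back_solve(Mp, n, n)                  # solution column
x_check = back_solve(Mp, n, n + 1)        # solution check column

# Each M' check entry should equal 1 + (sum of its row right of the diagonal).
row_ok = [abs(Mp[i][n + 1] - (1.0 + sum(Mp[i][i + 1:n + 1]))) < 1e-9
          for i in range(n)]
# Each x check entry should exceed the corresponding x entry by unity.
sol_ok = [abs(x_check[i] - (x[i] + 1.0)) < 1e-9 for i in range(n)]
```

In floating-point work the two comparisons hold only to within round-off, which is why a small tolerance is used in place of exact equality.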
For the set of Eqs. (10.3.2), the complete tabulation consists of the
given matrix (and check column, if desired),

     9.3746    3.0416   -2.4371 |  9.2333 | 19.2124
     3.0416    6.1832    1.2163 |  8.2049 | 18.6460        (10.4.8)
    -2.4371    1.2163    8.4429 |  3.9339 | 11.1560

the auxiliary matrix (and check column),

     9.3746    0.32445  -0.25997 | 0.98493 | 2.04941
     3.0416    5.19635   0.38624 | 1.00246 | 2.38870       (10.4.9)
    -2.4371    2.00702   7.03414 | 0.61448 | 1.61448

and the solution column (and check column),

    0.89643 | 1.89643
    0.76512 | 1.76512                                      (10.4.10)
    0.61448 | 1.61448

if five decimal places are retained in the calculations.
† It may be noticed that the set (10.3.2) is ordered in this way.

Here the computational-error check, based on the check columns,
displays no discrepancies through the fifth decimal place. This fact,
however, does not guarantee that the results are correct to five decimal
places. Indeed, the relationship between the size of discrepancies
indicated by the check column and the magnitude of the effects of
intermediate round-offs is not a simple one. The analysis of the effects of
round-off errors is complicated, when many equations are involved, and
cannot be treated here (see von Neumann and Goldstine [166]).

If the calculated values are substituted into the left-hand members of 
the original equations, the presence of deviations between the resultant 
members and the original right-hand members serves to indicate the pres- 
ence of errors due to intermediate round-off. But, again, the relationship 
between the magnitudes of these deviations and the magnitudes of the 
errors in the solution column is not a simple one. 

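However, we may notice that, if substitution of the calculated values
$\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n$ yields the right-hand members $\bar{c}_1, \bar{c}_2, \ldots, \bar{c}_n$, so that

$$\begin{aligned}
a_{11}\bar{x}_1 + a_{12}\bar{x}_2 + \cdots + a_{1n}\bar{x}_n &= \bar{c}_1, \\
&\;\;\vdots \\
a_{n1}\bar{x}_1 + a_{n2}\bar{x}_2 + \cdots + a_{nn}\bar{x}_n &= \bar{c}_n,
\end{aligned}$$

whereas the true values are to satisfy the equations

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1, \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n,
\end{aligned}$$

there follows, by subtraction,

$$\begin{aligned}
a_{11}\,\delta x_1 + a_{12}\,\delta x_2 + \cdots + a_{1n}\,\delta x_n &= \delta c_1, \\
&\;\;\vdots \\
a_{n1}\,\delta x_1 + a_{n2}\,\delta x_2 + \cdots + a_{nn}\,\delta x_n &= \delta c_n,
\end{aligned} \tag{10.4.11}$$

where

$$\delta x_k = x_k - \bar{x}_k, \qquad \delta c_k = c_k - \bar{c}_k. \tag{10.4.12}$$

Thus the necessary corrections $\delta x_1, \ldots, \delta x_n$ satisfy a set of equations
which differs from the original set only in that each $c_i$ is replaced by the
residual $c_i - \bar{c}_i$.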

If this set could be solved without round-off, the corrections would
thus be obtained exactly. But, since this situation generally will not
exist, the corrections obtained are themselves approximate. New
residuals can then be calculated, and the process can be iterated, if so desired.
Each such calculation is particularly simple in the Crout procedure, since
only the $c$ column of $M$, the $c'$ column of $M'$, and the solution column
need be recalculated, all other data being unchanged.

Whereas there is no certainty that the iteration will converge rapidly
(or at all), the method is usually an efficient one, and, by successive
iteration, carrying along more and more significant figures only in the
calculation of the residuals, it is usually possible to stabilize a large number
of significant figures without undue labor. Clearly, the same increase in
accuracy could be obtained alternatively, but generally with an increase
in labor, by repeating the entire calculation with retention of additional
significant figures.

In the case of the preceding example, the residuals corresponding to
the approximate solution (10.4.10) are found to be

$$\delta c_1 = -1.2462 \times 10^{-5}, \qquad \delta c_2 = 3.6504 \times 10^{-5}, \qquad \delta c_3 = -1.9095 \times 10^{-5},$$

and the approximate corrections are found to be

$$\delta x_1 = -0.59421 \times 10^{-5}, \qquad \delta x_2 = 0.98893 \times 10^{-5}, \qquad \delta x_3 = -0.54016 \times 10^{-5},$$

if five significant figures are retained, yielding the improved values

$$x_1 = 0.8964240579, \qquad x_2 = 0.7651298893, \qquad x_3 = 0.6144745984.$$

The new residuals are found to be of the order of $10^{-10}$, and another
iteration would supply 14-place accuracy, the rounded 10-place values
agreeing with those given above except for a one-unit change in the tenth
digit of $x_1$.

If the coefficients and right-hand members of the original set of
equations are only four-decimal-place approximations to true values, the
preceding retention of 10 or more decimal places may be expected to be
foolish, since it is useless to strive for a higher degree of accuracy than
that which is compatible with errors inherent in the given system. This
problem is to be considered explicitly in §10.6.

10.5. Determination of the Inverse Matrix. From (10.2.7) and
(10.2.8), it follows that the $k$th column of the matrix (10.2.8), which is
the inverse of the coefficient matrix (10.2.2), is the solution column
corresponding to the result of setting $c_k = 1$ and all other $c$'s equal to zero in
(10.2.1). Thus if, in place of the single $c$ column in (10.4.1), we insert
the square array

$$\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \tag{10.5.1}$$

of $n$ columns, and treat each column of this array as a $c$ column, we
obtain finally the array (10.2.8) in place of the single $x$ column. That is,
the resultant solution array will be the inverse of the coefficient matrix
of the given set of equations. A check column can be included, if
desired, and the rules given for its use apply as stated.

The determination of this inverse matrix is particularly desirable when
the set (10.2.1) is to be solved for many distinct sets of right-hand
members, for (10.2.7) states that each $x_k$, satisfying (10.2.1), is given by the
inner product of the $c$ column and the $k$th row of the inverse matrix (10.2.8).


In the case of the example previously treated, the auxiliary array
corresponding to the given array

    1   0   0
    0   1   0
    0   0   1

is found to be

     0.106671    0           0
    -0.062438    0.192443    0
     0.054773   -0.054909    0.142164

and the solution array is obtained in the form

     0.148032   -0.083594    0.054774
    -0.083594    0.213651   -0.054909        (10.5.2)
     0.054773   -0.054909    0.142164

Here six decimal places were retained, in order that five significant
figures would be afforded.† It may be noticed that the inverse matrix
possesses the same symmetry as the given matrix. (The single
discrepancy of one unit is due to round-off.)
The result obtained is equivalent to the statement that, apart from the
effects of round-offs, the solution of the set (10.3.2) would be of the form

$$\begin{aligned}
x_1 &= \phantom{-}0.148032c_1 - 0.083594c_2 + 0.054774c_3, \\
x_2 &= -0.083594c_1 + 0.213651c_2 - 0.054909c_3, \\
x_3 &= \phantom{-}0.054773c_1 - 0.054909c_2 + 0.142164c_3,
\end{aligned} \tag{10.5.3}$$

if the right-hand members of (10.3.2) were replaced by $c_1$, $c_2$, and $c_3$,
respectively. In particular, the substitution of the original right-hand
members into (10.5.3) again leads to (10.4.10).
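The column-by-column construction of the inverse can be sketched as follows. The helper `solve` is a hypothetical compact-elimination routine written for this illustration (a careful implementation would reduce the coefficient array once and reuse it for every right-hand column), and the data are those of the example.

```python
def solve(A, c):
    """Compact (Crout-type) reduction followed by the back solution."""
    n = len(A)
    M = [list(A[i]) + [c[i]] for i in range(n)]
    for s in range(n):
        for i in range(s, n):
            M[i][s] -= sum(M[i][k] * M[k][s] for k in range(s))
        for j in range(s + 1, n + 1):
            M[s][j] = (M[s][j] - sum(M[s][k] * M[k][j] for k in range(s))) / M[s][s]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
    return x


A = [[9.3746, 3.0416, -2.4371],
     [3.0416, 6.1832, 1.2163],
     [-2.4371, 1.2163, 8.4429]]
c = [9.2333, 8.2049, 3.9339]
n = 3

# Column k of the inverse is the solution for the k-th unit right-hand column.
columns = [solve(A, [1.0 if i == k else 0.0 for i in range(n)]) for k in range(n)]
inverse = [[columns[k][i] for k in range(n)] for i in range(n)]

# x_k is the inner product of the c column and the k-th row of the inverse.
x = [sum(inverse[k][i] * c[i] for i in range(n)) for k in range(n)]
```

Once `inverse` is available, each further set of right-hand members costs only the $n$ inner products in the last line.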

The elements of (10.5.2) are the reduced cofactors defined in (10.2.6),
in accordance with (10.2.7). Since, as was stated in §10.4, the
determinant of the given matrix is the product of the diagonal elements of
the auxiliary matrix (10.4.9),

$$D = (9.3746)(5.19635)(7.03414) \approx 342.66,$$

the array of the cofactors themselves is obtained by multiplication by $D$
and interchange of rows and columns.


† For purposes of illustrating the technique, we again ignore the fact that inherent
errors, due to round-off in the given data, may adversely affect the significance of
the results.

10.6. Inherent Errors. In addition to the errors due to intermediate
round-off, which generally can be detected and removed by the methods
previously described, errors due to possible inaccuracies in the coefficients
and right-hand members of the given equations themselves must be taken
into account.

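In order to investigate these errors, we suppose that the set actually
solved is

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1, \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n,
\end{aligned} \tag{10.6.1}$$

whereas the true values of the coefficients are $a_{ij} + \delta a_{ij}$ and the true values
of the right-hand members are $c_i + \delta c_i$. If we denote the true values of
the unknowns by $x_i + \delta x_i$, there follows also

$$\begin{aligned}
(a_{11} + \delta a_{11})(x_1 + \delta x_1) + \cdots + (a_{1n} + \delta a_{1n})(x_n + \delta x_n) &= c_1 + \delta c_1, \\
&\;\;\vdots \\
(a_{n1} + \delta a_{n1})(x_1 + \delta x_1) + \cdots + (a_{nn} + \delta a_{nn})(x_n + \delta x_n) &= c_n + \delta c_n,
\end{aligned} \tag{10.6.2}$$

and the result of subtracting Eqs. (10.6.1) from (10.6.2) is expressible in
the form

$$\begin{aligned}
a_{11}\,\delta x_1 + \cdots + a_{1n}\,\delta x_n &= \delta c_1 - (x_1\,\delta a_{11} + \cdots + x_n\,\delta a_{1n}), \\
&\;\;\vdots \\
a_{n1}\,\delta x_1 + \cdots + a_{nn}\,\delta x_n &= \delta c_n - (x_1\,\delta a_{n1} + \cdots + x_n\,\delta a_{nn}),
\end{aligned} \tag{10.6.3}$$

if products of errors, of the form $(\delta a_{ij})(\delta x_j)$, are assumed to be relatively
negligible.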

Thus, if the errors $\delta a_{ij}$ and $\delta c_i$ were known, the solution errors $\delta x_i$
would be obtained (to a degree of accuracy consistent with this
assumption) by solving a set of equations which differs from the set actually
solved only in that the right-hand member $c_i$ is to be replaced by $\eta_i$, where

$$\eta_i = \delta c_i - (x_1\,\delta a_{i1} + \cdots + x_n\,\delta a_{in}). \tag{10.6.4}$$

However, in practice, it is usually known only that the errors $\delta a_{ij}$ and
$\delta c_i$ do not exceed a certain positive number, say $\epsilon$, in magnitude, so that

$$-\epsilon \le \delta a_{ij} \le \epsilon, \qquad -\epsilon \le \delta c_i \le \epsilon. \tag{10.6.5}$$

Hence, in such cases, we are certain only that

$$|\eta_i| \le E, \tag{10.6.6}$$

where

$$E = (1 + |x_1| + |x_2| + \cdots + |x_n|)\,\epsilon. \tag{10.6.7}$$

The solution of the set (10.6.3) can be expressed in the form

$$\delta x_k = A_{1k}\eta_1 + A_{2k}\eta_2 + \cdots + A_{nk}\eta_n \qquad (k = 1, 2, \ldots, n), \tag{10.6.8}$$

with the notation of (10.2.6). Thus, if (10.6.6) is true, there follows

$$|\delta x_k| \le (|A_{1k}| + |A_{2k}| + \cdots + |A_{nk}|)\,E. \tag{10.6.9}$$


The reduced cofactors involved in this expression are the elements of
the $k$th row of the inverse of the coefficient matrix. Thus, if the inverse
matrix is calculated, approximate upper bounds on the effects of inherent
errors are obtainable from (10.6.9). They are not strictly upper bounds,
since they were derived under the assumption that terms of the form
$\delta a_{ik}\,\delta x_k$ are small in magnitude relative to $E$, as defined in (10.6.7).
However, unless the upper bounds predicted under this assumption are such
that the truth of the assumption is contradicted, they may be accepted
as close approximations to the true upper bounds (which could be
attained, in any case, only when all the errors combined in the most
unfavorable way).

In the case of the preceding numerical example, if it is supposed that
the coefficients and right-hand members are merely rounded
approximations to true values, there follows $E \approx 3.28\epsilon = 1.64 \times 10^{-4}$, and
reference to (10.5.2) yields the estimates

$$|\delta x_1|_{\max} \approx 0.29E = 0.48 \times 10^{-4}, \qquad |\delta x_2|_{\max} \approx 0.35E = 0.57 \times 10^{-4}, \qquad |\delta x_3|_{\max} \approx 0.25E = 0.41 \times 10^{-4}. \tag{10.6.10}$$

Thus, we could be confident only that the solution of the true equations
is such that

$$0.89637 < x_1 < 0.89648, \qquad 0.76507 < x_2 < 0.76519, \qquad 0.61443 < x_3 < 0.61452,$$

so that we could write $x_1 = 0.8964$, $x_2 = 0.7651$, and $x_3 = 0.6145$, with
the last digit in doubt by one unit in each case.
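The arithmetic of this estimate is short enough to sketch directly. The fragment below takes the printed inverse (10.5.2) and the five-decimal solution as data, and assumes $\epsilon = 0.5 \times 10^{-4}$, that is, that every given number is a correctly rounded four-decimal value; the variable names are illustrative.

```python
# Printed inverse (10.5.2) and five-decimal solution (10.4.10), used as data.
inverse = [[0.148032, -0.083594, 0.054774],
           [-0.083594, 0.213651, -0.054909],
           [0.054773, -0.054909, 0.142164]]
x = [0.89643, 0.76512, 0.61448]

eps = 0.5e-4                                   # assumed bound on each data error
E = (1.0 + sum(abs(v) for v in x)) * eps       # relation (10.6.7)

# Relation (10.6.9): row sums of magnitudes of the inverse, multiplied by E.
bounds = [sum(abs(a) for a in row) * E for row in inverse]
intervals = [(x[k] - bounds[k], x[k] + bounds[k]) for k in range(3)]
```

The bounds computed this way differ slightly in the second figure from the values quoted in the text, which are formed from factors already rounded to two figures.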

Unless the given system of equations is to be solved for various sets
of right-hand members, so that the determination of the inverse matrix is
advisable in any case, an error estimate which does not involve the
calculation of the elements of that matrix is desirable. It is clear that the
upper bound (10.6.9) would correspond to the value of $\delta x_k$ which satisfies
a set of equations of the form

$$\begin{aligned}
a_{11}\,\delta x_1 + \cdots + a_{1n}\,\delta x_n &= \pm E, \\
&\;\;\vdots \\
a_{n1}\,\delta x_1 + \cdots + a_{nn}\,\delta x_n &= \pm E,
\end{aligned} \tag{10.6.11}$$

for some choice of the ambiguous signs of each of the $n$ right-hand
members. However, the manner in which the signs are to be associated with
successive equations cannot be determined unless the signs of the relevant
cofactors of the coefficients of $\delta x_k$ in these equations are known.
Furthermore, a different combination of signs may be needed to maximize each
of the $\delta x$'s.

Reference to the Crout reduction shows that each of the elements in
the solution column is obtained as a linear combination of the elements
of the $c'$ column of the auxiliary matrix, each of which is, in turn, a linear
combination of the elements of the $c$ column (that is, of the original
right-hand members). Thus, if each entry in the $c$ column were $+E$, and if,
in the calculation of each element of the $c'$ and $x$ columns, we were to
replace all subtractions by additions, it follows that no element of the
resultant $x$ column could be exceeded in magnitude by a corresponding
element obtained by solving (10.6.11) with any prescribed combination of
signs.

Hence, if an additional column with unity as each element is adjoined to
the matrix $M$ and is transformed just as the $c$ column except for the fact
that all subtractions are replaced by additions, the result of multiplying
by $E$ the elements of the final corresponding column, adjoined to the
solution column, gives (approximate) upper bounds on the possible errors
in the corresponding elements of the solution column, due to possible
errors in the coefficients and right-hand members of the given equations
(this procedure appears to be due to Milne [10]). The bounds obtained
in this way usually exceed the more precise bounds afforded by (10.6.9)
but are obtained much more simply.

In the case of the illustrative example, the inherent-error check
columns adjoined to the given, auxiliary, and final arrays are found to be

    1        0.10667        0.28640
    1        0.25488        0.35215
    1        0.25185        0.25185

Here it happens that the upper bounds, obtained by multiplying the
successive elements of the last column by $E$, are identical with those
afforded by the preceding analysis.
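The unity-column procedure can be sketched in a few lines. Two points are assumptions of this illustration rather than statements of the text: the helper names are hypothetical, and "replacing subtractions by additions" is read as adding every product in absolute value, which is the reading that reproduces the three columns tabulated above.

```python
def crout_square(A):
    """Auxiliary square array A' of the compact reduction."""
    n = len(A)
    Ap = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for i in range(s, n):
            Ap[i][s] = A[i][s] - sum(Ap[i][k] * Ap[k][s] for k in range(s))
        for j in range(s + 1, n):
            Ap[s][j] = (A[s][j] - sum(Ap[s][k] * Ap[k][j] for k in range(s))) / Ap[s][s]
    return Ap


def error_columns(Ap):
    """Transform a column of ones like the c column, adding magnitudes throughout."""
    n = len(Ap)
    cp = [0.0] * n
    for i in range(n):            # analogue of (10.4.6) with additions
        cp[i] = (1.0 + sum(abs(Ap[i][k]) * cp[k] for k in range(i))) / abs(Ap[i][i])
    b = [0.0] * n
    for i in range(n - 1, -1, -1):   # analogue of (10.4.7) with additions
        b[i] = cp[i] + sum(abs(Ap[i][k]) * b[k] for k in range(i + 1, n))
    return cp, b


A = [[9.3746, 3.0416, -2.4371],
     [3.0416, 6.1832, 1.2163],
     [-2.4371, 1.2163, 8.4429]]

aux_col, final_col = error_columns(crout_square(A))
E = 1.64e-4                                  # value quoted in the text
bounds = [v * E for v in final_col]          # approximate error bounds
```

The cost is one extra column in a reduction that has to be carried out anyway, which is the sense in which these bounds are obtained "much more simply" than those requiring the inverse.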

The fact that this situation is not a general one may be illustrated, for
example, by the case when only two equations are involved. Here, the
estimates afforded by (10.6.9) are

$$|\delta x_1|_{\max} \approx \frac{|a_{22}| + |a_{12}|}{|D|}\,E, \qquad |\delta x_2|_{\max} \approx \frac{|a_{11}| + |a_{21}|}{|D|}\,E,$$

whereas it is readily verified that the estimates afforded by the simpler
procedure are

$$|\delta x_1|_{\max} \approx \frac{|a_{11}a_{22} - a_{12}a_{21}| + |a_{12}a_{21}| + |a_{11}a_{12}|}{|a_{11}|\,|D|}\,E, \qquad |\delta x_2|_{\max} \approx \frac{|a_{11}| + |a_{21}|}{|D|}\,E.$$

Whereas the estimates for $|\delta x_2|_{\max}$ are identical, it is seen that the latter
estimate for $|\delta x_1|_{\max}$ exceeds the former estimate except in special cases
such as that in which $a_{11}a_{22} > a_{12}a_{21} > 0$.

Systems in which small errors in the coefficients or right-hand members 
may correspond to large errors in the solution are often said to be ill- 
conditioned systems and are essentially characterized by the fact that the 
determinant of the coefficient matrix is small, in magnitude, relative to 
certain of the cofactors of elements of that matrix.† When such a
system is encountered, one must either make the inherent errors small by
retaining a large number of significant figures in the given data, when this 
is possible, or merely accept the fact that inaccuracies in the given data 
then permit the solution to be determined only within relatively wide 
error limits. 

10.7. Gauss-Seidel Iteration and Relaxation. In many sets of linear
equations which arise in practice, the equations can be ordered in such a
way that the coefficient of $x_k$ in the $k$th equation is large in magnitude
relative to all other coefficients in that equation. Such sets are often
amenable to an iterative process in which the set is first rewritten in the
form

$$\begin{aligned}
x_1 &= \frac{1}{a_{11}}\,(c_1 - a_{12}x_2 - a_{13}x_3 - \cdots - a_{1n}x_n), \\
x_2 &= \frac{1}{a_{22}}\,(c_2 - a_{21}x_1 - a_{23}x_3 - \cdots - a_{2n}x_n), \\
&\;\;\vdots \\
x_n &= \frac{1}{a_{nn}}\,(c_n - a_{n1}x_1 - a_{n2}x_2 - \cdots - a_{n,n-1}x_{n-1}).
\end{aligned} \tag{10.7.1}$$

The initial approximation is taken to be

$$x_1^{(0)} = \frac{c_1}{a_{11}}, \qquad x_2^{(0)} = \frac{c_2}{a_{22}}, \qquad \ldots, \qquad x_n^{(0)} = \frac{c_n}{a_{nn}}. \tag{10.7.2}$$

The next approximation is obtained by replacing the unknowns in the
right-hand members of (10.7.1) by these initial approximations, and the
feed-back process is to be repeated until the input and output of a cycle
agree within the specified tolerance. The iteration may or may not
converge.

† More specific measures of the "condition" of a matrix have been proposed by
von Neumann and Goldstine [166], Turing [227], and others. Whereas these measures
are of marked theoretical importance, their usefulness in explicit practical cases is
limited by the amount of computation involved in their evaluation.

In the case of the system (10.3.2), about 10 such iterations are required
for three-place accuracy. If the iteration is modified in such a way that
each unknown in each right-hand member is replaced by its most recently
calculated approximation, rather than by the approximation afforded by
the preceding cycle, the rate of convergence depends upon the order in
which the $x$'s are modified. In particular, if they are modified in their
natural order, the number of cycles required to afford three-place accuracy
is reduced to about six. The latter procedure is often called
Gauss-Seidel iteration, although its attribution to either Gauss or Seidel appears
to be improper.
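The two variants differ only in which values are read on the right-hand side of (10.7.1), as the following sketch shows. The function name `iterate`, the stopping rule (largest change in a cycle below $0.5 \times 10^{-3}$), and the cycle limit are assumptions of this illustration, so the cycle counts it reports need not coincide exactly with the "about 10" and "about six" quoted above.

```python
def iterate(A, c, gauss_seidel, tol=0.5e-3, max_cycles=100):
    """Iterate (10.7.1) from the start (10.7.2); return (x, cycles used)."""
    n = len(A)
    x = [c[i] / A[i][i] for i in range(n)]       # initial approximation (10.7.2)
    for cycle in range(1, max_cycles + 1):
        old = list(x)
        # Gauss-Seidel reads the values as they are updated; the simple
        # iteration reads only the values of the preceding cycle.
        source = x if gauss_seidel else old
        for i in range(n):
            s = sum(A[i][j] * source[j] for j in range(n) if j != i)
            x[i] = (c[i] - s) / A[i][i]
        if max(abs(x[i] - old[i]) for i in range(n)) < tol:
            return x, cycle
    raise RuntimeError("no agreement within the cycle limit")


A = [[9.3746, 3.0416, -2.4371],
     [3.0416, 6.1832, 1.2163],
     [-2.4371, 1.2163, 8.4429]]
c = [9.2333, 8.2049, 3.9339]

x_simple, cycles_simple = iterate(A, c, gauss_seidel=False)
x_gs, cycles_gs = iterate(A, c, gauss_seidel=True)
```

For this system the Gauss-Seidel variant should need noticeably fewer cycles than the simple iteration, in line with the comparison made in the text.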

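In the frequently occurring cases when the coefficient matrix is
symmetric, so that $a_{ij} = a_{ji}$, the iteration will converge if and only if all the
$n$ quantities

$$a_{11}, \qquad \begin{vmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{vmatrix}, \qquad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{vmatrix}, \qquad \ldots, \qquad \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{nn} \end{vmatrix}$$

are positive (or become positive when the signs of all $a$'s are changed)
(see Reich [191]).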
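The $n$ determinants need not be expanded separately: each is the product of the first $k$ divisors (pivots) met in an elimination carried out without interchanges. The sketch below relies on that standard fact; the function names are hypothetical, and the routine assumes that no zero pivot is encountered.

```python
def leading_minors(A):
    """Leading principal minors as running products of the elimination pivots."""
    n = len(A)
    U = [list(row) for row in A]
    minors, product = [], 1.0
    for k in range(n):
        product *= U[k][k]            # k-th pivot (assumed nonzero)
        minors.append(product)
        for i in range(k + 1, n):     # eliminate x_k from the later equations
            m = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return minors


def criterion_holds(A):
    """True if all minors are positive for A, or for A with all signs changed."""
    negated = [[-a for a in row] for row in A]
    return (all(m > 0 for m in leading_minors(A))
            or all(m > 0 for m in leading_minors(negated)))


A = [[9.3746, 3.0416, -2.4371],
     [3.0416, 6.1832, 1.2163],
     [-2.4371, 1.2163, 8.4429]]

minors = leading_minors(A)
converges = criterion_holds(A)
```

For the symmetric example the three quantities come out positive, consistent with the convergence observed for that system.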

There exist many other numerical techniques for solving sets of linear
equations, some of which are reductions which would yield the exact
solution of a set of $n$ equations in $n$ unknowns after a finite number of
steps if no round-offs were effected (as is true for the Gauss,
Gauss-Jordan, and Crout reductions), and others which are basically iterative,
in the sense that generally an infinite number of steps would be required
to afford exactness in the absence of round-off (as in the Gauss-Seidel
process).

Particular mention should be made of the method of conjugate gradients,
due to Hestenes and Stiefel, which would terminate in the absence of
round-off, and of a rather extensive class of other "gradient methods"
and related methods.

On the other extreme, there exist the so-called "relaxation methods,"
apparently invented by Gauss, and revived and popularized by
Southwell, in which the rapidity (or existence) of convergence depends upon the
ingenuity of the computer. Because of their flexibility, they are processes
which do not lend themselves to mechanization, but which are useful
for hand computation in many situations.

In applying the relaxation procedure to the solution of a set of equations
of the form

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= c_1, \\
&\;\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= c_n,
\end{aligned}$$

we first define residuals $R_1, R_2, \ldots, R_n$ by the equations

$$\begin{aligned}
c_1 - a_{11}x_1 - a_{12}x_2 - \cdots - a_{1n}x_n &= R_1, \\
&\;\;\vdots \\
c_n - a_{n1}x_1 - a_{n2}x_2 - \cdots - a_{nn}x_n &= R_n.
\end{aligned} \tag{10.7.4}$$

The unknowns $x_1, x_2, \ldots, x_n$ are then estimated, and the corresponding
residuals are calculated, after which the estimated values of the unknowns
are to be successively modified (one or more at a time) in such a way that
the magnitudes of all residuals are eventually reduced effectively to zero.

Reference to (10.7.4) shows that when $x_i$ is increased by unity, and all
other $x$'s are held fixed, $R_j$ decreases by $a_{ji}$. Thus the transpose of the
coefficient matrix,

$$\begin{bmatrix}
a_{11} & a_{21} & a_{31} & \cdots & a_{n1} \\
a_{12} & a_{22} & a_{32} & \cdots & a_{n2} \\
\vdots & \vdots & \vdots & & \vdots \\
a_{1n} & a_{2n} & a_{3n} & \cdots & a_{nn}
\end{bmatrix}, \tag{10.7.5}$$

in which the rows and columns of the original matrix are interchanged,
serves as a relaxation table, in the sense that the successive entries in the
$k$th row of (10.7.5) represent the decreases in the successive residuals which
correspond to a unit increase in $x_k$. If the original coefficient matrix is
symmetrical, the matrix (10.7.5) is then identical with it.

The situations which generally are most favorable to this process are
those in which each of the diagonal elements $a_{11}, a_{22}, \ldots, a_{nn}$ is large
in magnitude relative to the other elements in its row and column. For,
in such cases, an increase in $x_k$ of about $R_k/a_{kk}$ will nearly reduce $R_k$ to
zero, but will affect the other residuals by relatively small amounts, and
subsequent modifications of other $x$'s generally will not seriously nullify
the effect of this reduction.

It may be seen that the so-called Gauss-Seidel iteration consists in
determining successive corrections in exactly this way, since it defines the
modified $x_k$, say $x'_k$, by the equation

$$\begin{aligned}
x'_k &= \frac{1}{a_{kk}}\,(c_k - a_{k1}x_1 - \cdots - a_{k,k-1}x_{k-1} - a_{k,k+1}x_{k+1} - \cdots - a_{kn}x_n) \\
&= \frac{1}{a_{kk}}\,(R_k + a_{kk}x_k) = x_k + \frac{R_k}{a_{kk}}.
\end{aligned}$$


The advantages associated with the relaxation process follow from the 
fact that the values of the residuals are known at each stage. Thus it is 
possible to focus attention at each stage on the residual of largest mag- 
nitude and either to reduce its magnitude effectively to zero or proceed 
otherwise if an alternative procedure appears to be desirable, rather than 
to modify the x’s successively in a preassigned cyclic order as in the Gauss- 
Seidel iteration. At the same time, the fact that an efficient use of the 
process requires a decision after each step not only places a premium on 
the ingenuity of the computer, but also makes the procedure poorly
suited to large-scale programed calculators.

A typical sequence of relaxations, as applied to the result of first
rounding all numerical coefficients in (10.3.2) to three digits, is included
for purposes of illustration.

    Relaxation table:

         9.37     3.04    -2.44
         3.04     6.18     1.22
        -2.44     1.22     8.44

                             (9)       (6)       (8)
     Δx_1  Δx_2  Δx_3        R_1       R_2       R_3
       0     0     0         9.23      8.20      3.93
       1                    -0.14      5.16      6.37
                   1         2.30      3.94     -2.07
             1              -0.74     -2.24     -3.29
                            -7.40    -22.40    -32.90    × 10^-1
                  -4       -17.16    -17.52      0.86
            -3              -8.04      1.02      4.52
      -1                     1.33      4.06      2.08
             1              -1.71     -2.12      0.86
                           -17.10    -21.20      8.60    × 10^-2
            -4              -4.94      3.52     13.48
                   2        -0.06      1.08     -3.40
                            -0.60     10.80    -34.00    × 10^-3
                  -4       -10.36     15.68     -0.24
             3             -19.48     -2.86     -3.90
      -2                    -0.74      3.22     -8.78
                  -1        -3.18      4.44     -0.34
             1              -6.22     -1.74     -1.56
      -1                     3.15      1.30     -4.00


The relaxation table is written down immediately, and columns are
provided for successive changes in the estimated unknowns and for the
successive values of the three residuals. Values of the diagonal elements,
rounded to the nearest integer, are encircled above the corresponding
residuals, for convenience in estimating appropriate changes in the $x$'s.

Starting arbitrarily with the crude approximation $x_1 = x_2 = x_3 = 0$,
the initial residuals are then merely the right-hand members of the given
equations and are listed in the first row of the calculation. Since the
largest residual at this stage is $R_1 = 9.23$, we increase $x_1$ by the integer
nearest $R_1/a_{11} \approx R_1/9$, and so enter a unit in the $\Delta x_1$ column and
subtract unity times the first row of the relaxation table from the row of
residuals. At this stage $R_3$ is largest in magnitude, and $x_3$ is increased
by $1 \approx 6.37/8$, after which a unit increase in $x_2$ is called for. At this
stage, each residual is less than one-half the corresponding rounded
diagonal coefficient, and it is convenient to multiply the residuals by a factor
of 10. The corresponding changes in the $x$'s accordingly are then to be
divided by 10 when all these changes are eventually accumulated.

The approximate solution at the last stage of the tabulation given is
$x_1 = 0.897$, $x_2 = 0.764$, and $x_3 = 0.615$. It may be noticed that, with
this arrangement of the calculations, the entries in the relaxation table 
need only be multiplied by integers. Also, it is possible to avoid all inter- 
mediate round-off without carrying more decimal places than are involved 
in the given data. In particular, the residuals corresponding to the three- 
place approximations obtained at the last stage given would be exactly 
0.00315, 0.00130, and —0.00400 if the given three-digit data were exact. 

However, it is desirable to accumulate the increments in the x’s, from 
time to time, and to calculate the corresponding residuals directly, in 
order to avoid the propagation of the effects of gross errors. 
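A mechanized caricature of the process is sketched below. It is not the hand procedure of the tabulation: instead of integer changes and periodic rescaling, each step simply reduces the residual of largest magnitude to zero by changing the corresponding unknown by $R_k/a_{kk}$. The function name `relax`, the tolerance, and the step limit are assumptions of the illustration.

```python
def relax(A, c, tol=1e-6, max_steps=1000):
    """Relax the largest residual at each step, starting from x = 0."""
    n = len(A)
    x = [0.0] * n
    R = list(c)                       # residuals (10.7.4) for x = 0
    for step in range(max_steps):
        k = max(range(n), key=lambda i: abs(R[i]))   # largest residual
        if abs(R[k]) < tol:
            return x, R, step
        dx = R[k] / A[k][k]           # change which reduces R_k to zero
        x[k] += dx
        for j in range(n):
            R[j] -= A[j][k] * dx      # R_j decreases by a_jk per unit of x_k
    raise RuntimeError("residuals not reduced within the step limit")


# Coefficients of (10.3.2) rounded to three digits, as in the tabulation.
A3 = [[9.37, 3.04, -2.44],
      [3.04, 6.18, 1.22],
      [-2.44, 1.22, 8.44]]
c3 = [9.23, 8.20, 3.93]

x, R, steps = relax(A3, c3)

# Direct recalculation of the residuals from the accumulated x's, as the
# text recommends, guards against the propagation of gross errors.
direct = [c3[i] - sum(A3[i][j] * x[j] for j in range(3)) for i in range(3)]
```

The fixed rule "always take the largest residual" removes exactly the element of judgment that the text regards as the strength of the hand method, so the sketch should be read as a baseline rather than as a faithful model of it.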

Relaxation methods are particularly useful when a fairly large number
of equations is involved and when only the diagonal elements and their
neighbors in the coefficient matrix differ from zero, that is, when the $k$th
equation involves, say, only $x_{k-1}$, $x_k$, and $x_{k+1}$, as in Eqs. (6.17.17).

10.8. Iterative Methods for Solving Nonlinear Equations. Most of
the useful methods for obtaining an approximate real solution of a real
equation, of the form

$$f(x) = 0, \tag{10.8.1}$$

involve iterative processes in which an initial approximation $z_0$ to a desired
real root $x = \alpha$ is obtained, by rough graphical methods or otherwise,
and a certain recurrence relation is used to generate a sequence of
successive approximations $z_1, z_2, \ldots, z_n, \ldots$ which converges (in a certain
associated class of cases) to the limit $\alpha$.

One such method is that of successive substitutions, in which (10.8.1)
is first rewritten in an equivalent form

$$x = F(x), \tag{10.8.2}$$

and the recurrence relation is of the simple form

$$z_{k+1} = F(z_k). \tag{10.8.3}$$

Generally there are many convenient ways of rewriting (10.8.1) in the
form (10.8.2), and the convergence or divergence of the sequence of
approximations may depend upon the particular form chosen.

In order to see why this is so, we may notice first that, since $F(\alpha) = \alpha$,
(10.8.3) implies the relation

$$\alpha - z_{k+1} = F(\alpha) - F(z_k) = (\alpha - z_k)\,F'(\xi_k), \tag{10.8.4}$$

where $\xi_k$ lies between $z_k$ and $\alpha$, under the assumption that $F(x)$ possesses
a continuous derivative over that range. If the iteration converges, so
that $z_k \to \alpha$ as $k \to \infty$, then, for sufficiently large $k$, we must have
$F'(\xi_k) \approx F'(\alpha)$, and hence

$$\alpha - z_k \approx A\,[F'(\alpha)]^k, \tag{10.8.5}$$

where $A$ is a constant, and this deviation tends to zero as $k$ increases only
if $|F'(\alpha)| < 1$. Thus it appears that, in order that the iteration converge
to $x = \alpha$, it is necessary that $|F'(x)| < 1$ in the neighborhood of $x = \alpha$.

If we here define the convergence factor $\rho_k$ as the ratio of the error in
$z_{k+1}$ to the error in $z_k$, it follows that if $z_k$ is near $\alpha$, then $\rho_k \approx F'(\alpha)$.
Unless $|F'(\alpha)| < 1$, a small error in $z_k$ is increased in magnitude by the
iteration, and we say that the iteration is then asymptotically unstable at $\alpha$.
The number $F'(\alpha)$ may be called the asymptotic convergence factor.

If the initial approximation is sufficiently near $\alpha$, and if the iteration is
asymptotically stable at $\alpha$ [so that $|F'(\alpha)| < 1$], the sequence of iterates
will indeed converge to $\alpha$, in such a way that ultimately the successive
approximations tend toward $\alpha$ from one direction if $0 < F'(\alpha) < 1$, and
oscillate about $\alpha$ with decreasing amplitude if $-1 < F'(\alpha) < 0$.

In the special cases when $F'(\alpha) = 0$, the nature of the convergence
depends upon the behavior of the higher derivatives of $F(x)$ near $x = \alpha$.

In illustration, a rough plot of the function $y = x^3 - x - 1$ shows that the real root of the equation

$$x^3 - x - 1 = 0 \tag{10.8.6}$$

is between $x = 1$ and $x = 2$, and is near $x = 1.3$. This equation can be conveniently written in the form (10.8.2) in various ways, such as $x = x^3 - 1$, $x = 1/(x^2 - 1)$, and $x = (x + 1)^{1/3}$. However, only the third (and least convenient) of these particular forms is such that the derivative of the right-hand member is smaller than unity in absolute value near $x = 1.3$. Hence, we may use the recurrence formula

$$z_{k+1} = (z_k + 1)^{1/3} \tag{10.8.7}$$

and, with $z_0 = 1.3$, obtain the sequence $z_1 = 1.3200$, $z_2 = 1.3238$, $z_3 = 1.3245$, $z_4 \approx z_5 = 1.3247$, when four decimal places are retained. The true value is 1.3247179572, to 10 places.
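The iterates quoted above can be reproduced by machine. The following is a minimal Python sketch, not part of the original text; the function name `successive_substitution` is an illustrative choice. It applies $z_{k+1} = F(z_k)$ with $F(x) = (x + 1)^{1/3}$, starting from $z_0 = 1.3$.

```python
def successive_substitution(F, z0, steps):
    """Return [z_0, z_1, ..., z_steps] generated by z_{k+1} = F(z_k)."""
    zs = [z0]
    for _ in range(steps):
        zs.append(F(zs[-1]))
    return zs


# Third rearrangement of x^3 - x - 1 = 0 discussed in the text.
iterates = successive_substitution(lambda x: (x + 1.0) ** (1.0 / 3.0), 1.3, 5)

for k, z in enumerate(iterates):
    print(k, round(z, 4))
```

Because the derivative of $(x + 1)^{1/3}$ is roughly 0.19 near the root, each step reduces the error by about a factor of five, which is consistent with the slow but steady gain of digits seen in the printed sequence.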

In view of (10.8.5), we may notice that, if the iteration converges, and if $F'(\alpha) \neq 0$, the approximation

$$\alpha - z_k \approx A\rho^k$$

will be valid for some constants $A$ and $\rho$, independent of $k$, when $k$ is sufficiently large. If we rewrite this relation with $k$ replaced by $k + 1$ and by $k + 2$, and eliminate the unknown $A$ and $\rho$ from the resultant three relations, we may deduce the approximation

$$\frac{\alpha - z_{k+2}}{\alpha - z_{k+1}} \approx \frac{\alpha - z_{k+1}}{\alpha - z_k},$$

which yields the estimate

$$\alpha \approx \frac{z_k z_{k+2} - z_{k+1}^2}{z_{k+2} - 2z_{k+1} + z_k}$$

or, equivalently,

$$\alpha \approx z_{k+2} - \frac{(z_{k+2} - z_{k+1})^2}{z_{k+2} - 2z_{k+1} + z_k} = z_{k+2} - \frac{(\Delta z_{k+1})^2}{\Delta^2 z_k}, \tag{10.8.8}$$

where

$$\Delta z_k = z_{k+1} - z_k, \qquad \Delta^2 z_k = \Delta z_{k+1} - \Delta z_k = z_{k+2} - 2z_{k+1} + z_k.$$

Thus, if three successive iterates $z_k$, $z_{k+1}$, and $z_{k+2}$ are known, this relation affords an extrapolation which may be expected to provide an improved estimate of $\alpha$, when the iteration converges. This procedure is often called Aitken's $\delta^2$ process. In the preceding example, with $z_3 = 1.3245$, $\Delta z_2 = 0.0007$, and $\Delta^2 z_1 = -0.0031$, to four places, (10.8.8) yields the extrapolation $\alpha \approx 1.3245 + 0.0002 = 1.3247$, which happens to agree with $z_4$ to four places and is correct to those four places. If additional digits had been retained in the calculation of the iterates $z_1$, $z_2$, and $z_3$, even though those digits were not of apparent importance to the iterates themselves, the approximate value of $\alpha$ obtained from them by an extrapolation based on (10.8.8) would have been found to be correct to additional places.
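The closing remark about retained digits can be checked numerically. The sketch below is an illustration only (the helper name `aitken` is an assumption, not the book's notation); it applies (10.8.8) once to the four-place iterates and once to unrounded iterates of $z_{k+1} = (z_k + 1)^{1/3}$.

```python
def aitken(z0, z1, z2):
    """Aitken extrapolation (10.8.8) from three successive iterates."""
    d1 = z2 - z1                  # Delta z_{k+1}
    d2 = z2 - 2.0 * z1 + z0       # Delta^2 z_k
    return z2 - d1 * d1 / d2


# Extrapolation from the rounded iterates z_1, z_2, z_3 of the text.
rounded_estimate = aitken(1.3200, 1.3238, 1.3245)

# Same extrapolation from unrounded iterates of z_{k+1} = (z_k + 1)^(1/3).
z = [1.3]
for _ in range(3):
    z.append((z[-1] + 1.0) ** (1.0 / 3.0))
full_estimate = aitken(z[1], z[2], z[3])

print(round(rounded_estimate, 4), full_estimate)
```

With the rounded inputs the estimate agrees with 1.3247 to four places; with the unrounded inputs it lies much closer to the true root, in line with the statement above.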

In a wide class of related methods for dealing with (10.8.1), a recurrence formula of the type

$$z_{k+1} = z_k - \frac{f(z_k)}{\gamma_k} \tag{10.8.9}$$

is used, with a suitable definition of the auxiliary sequence $\gamma_0, \gamma_1, \ldots, \gamma_k, \ldots$. The relation (10.8.3) can be specialized to (10.8.9) by writing $F(x) = x - \phi(x)f(x)$, where $\phi(x)$ is a function such that $\phi(z_k) = 1/\gamma_k$. It should be noticed, however, that the function $F(x) - x$ relevant to the method of successive substitutions is not necessarily proportional to $f(x)$, but is required only to be a function which vanishes at the required point $\alpha$ for which $f$ vanishes. Conversely, the explicit definition of a function $\phi(x)$ which takes on the chosen value $1/\gamma_k$ when $x = z_k$ obviously is not necessary in the present case.

It is clear that if the $z$ sequence converges, so that $z_{k+1} - z_k \to 0$, and if $\gamma_k$ remains finite, there then follows $f(z_k) \to 0$, so that $z_k$ tends to a solution of (10.8.1). In particular, the requirement $z_{k+1} = \alpha$, where $f(\alpha) = 0$, would imply that

$$\gamma_k = \frac{0 - f(z_k)}{\alpha - z_k}, \tag{10.8.10}$$

so that $\gamma_k$ would then represent the slope of the secant line joining the points $P_k(z_k, f_k)$ and $P(\alpha, 0)$ in Fig. 10.1. Thus it is desirable to define the $\gamma$ sequence in such a way that this situation is approximated at each stage of the calculation.
[Fig. 10.1]

[Fig. 10.2]

In the method of "false position" (regula falsi), the iteration is initiated by finding $z_0$ and $z_1$ such that $f_0$ and $f_1$ are of opposite signs and by defining $\gamma_1$ as the slope of the secant $P_0P_1$ (Fig. 10.2), so that

$$\gamma_1 = \frac{f_1 - f_0}{z_1 - z_0}. \tag{10.8.11}$$

In each following iteration, $\gamma_k$ is taken as the slope of the line joining $P_k$ and the most recently determined point at which the ordinate differs in sign from that at $P_k$. The procedure is seen to be merely iterated linear inverse interpolation and is clearly certain to converge, although the rate of convergence may be slow. In the case of (10.8.6), with $z_0 = 1.3$ and $z_1 = 1.4$, the next three iterates may be found as follows:





$z_k$         $f_k$          $1/\gamma_k$    $-f_k/\gamma_k$
1.3          -0.103
1.4           0.344         0.224          -0.077
1.323        -0.00731       0.219           0.0016
1.3246       -0.000503      0.219           0.000110
1.324710
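The tabulated calculation can be sketched in Python as follows. This is an illustrative implementation under the rule stated above (each secant joins the current point to the most recent point of opposite sign); the names `false_position` and `cubic` are assumptions, and no intermediate rounding is applied, so the later digits differ slightly from the hand computation.

```python
def false_position(f, z0, z1, steps):
    """Regula falsi; f(z0) and f(z1) must have opposite signs."""
    a, fa = z0, f(z0)   # most recent point with sign opposite to that at b
    b, fb = z1, f(z1)   # current point P_k
    if fa * fb > 0:
        raise ValueError("initial ordinates must differ in sign")
    iterates = [z0, z1]
    for _ in range(steps):
        gamma = (fb - fa) / (b - a)      # slope of the secant
        z = b - fb / gamma               # recurrence (10.8.9)
        fz = f(z)
        iterates.append(z)
        if fz == 0.0:
            break
        if fz * fb < 0:
            a, fa = b, fb                # old current point becomes the partner
        b, fb = z, fz
    return iterates


cubic = lambda x: x**3 - x - 1.0
iterates = false_position(cubic, 1.3, 1.4, 10)
print([round(z, 6) for z in iterates[:5]])
```

The first five entries correspond to $z_0, \ldots, z_4$ of the table; running further steps shows the slow, one-sided approach to the root that the text describes.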


As this example illustrates, the factor $\gamma_k$ often changes slowly after the first few steps, and the rate of convergence then is only slightly reduced if, from such a stage onward, $\gamma_k$ is assigned a constant value.

In this illustration, the approximation $z_4$ was obtained by interpolation based on $z_3$ and $z_1$, in accordance with the preceding description of the procedure. If, instead, the last two abscissas available are used, so that here $z_4$ is obtained by extrapolation based on $z_3$ and $z_2$, with $1/\gamma_3 \approx 0.235$, a better approximation (1.324718) is obtained. More generally, whereas the systematic use of the slope of the secant $P_{k-1}P_k$ cannot be guaranteed to yield a convergent sequence when it requires extrapolation, this modified procedure is usually advantageous when it does converge, and its use near the end of the calculation is often desirable.

It is of importance to notice that, since $f(\alpha) = 0$, the general recurrence relation (10.8.9) implies the relation

$$\alpha - z_{k+1} = \alpha - z_k - \frac{f(\alpha) - f(z_k)}{\gamma_k} = (\alpha - z_k)\left[1 - \frac{f'(\xi_k)}{\gamma_k}\right], \tag{10.8.12}$$

where $\xi_k$ is between $z_k$ and $\alpha$. Thus the convergence factor $\rho_k$ at the $k$th stage is given, to a first approximation, by $1 - [f'(\alpha)/\gamma_k]$ when $z_k$ is near $\alpha$, and, unless this factor is smaller than unity in magnitude, so that

$$0 < \frac{f'(\alpha)}{\gamma_k} < 2 \tag{10.8.13}$$

when $k$ is large, convergence of $z_k$ to $\alpha$ generally cannot be obtained.

An important method, known as the Newton-Raphson method, consists in taking $\gamma_k$ in (10.8.9) as the slope of the curve $y = f(x)$ at the point $z_k$ (Fig. 10.3), so that (10.8.9) becomes

$$z_{k+1} = z_k - \frac{f(z_k)}{f'(z_k)}. \tag{10.8.14}$$

This iteration is seen to be the special case of (10.8.3) in which

$$F(x) = x - \frac{f(x)}{f'(x)}$$

and hence $F'(x) = f(x)f''(x)/[f'(x)]^2$. Thus, if $f'(\alpha) \neq 0$ and $f''(\alpha)$ is finite, there follows $F'(\alpha) = 0$, so that the convergence factor tends to zero when and if $z_k \to \alpha$.
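[Fig. 10.3]

In order to examine the behavior of the error $\alpha - z_k$, we rewrite (10.8.14) in the equivalent form

$$\alpha - z_{k+1} = \alpha - z_k - \frac{f(\alpha) - f(z_k)}{f'(z_k)} \tag{10.8.15}$$

and recall that

$$f(\alpha) - f(z_k) = (\alpha - z_k)f'(z_k) + \tfrac{1}{2}(\alpha - z_k)^2 f''(\xi_k),$$

where $\xi_k$ lies between $z_k$ and $\alpha$, if $f''(x)$ is continuous in that interval, so that (10.8.15) becomes

$$\alpha - z_{k+1} = -\tfrac{1}{2}(\alpha - z_k)^2\,\frac{f''(\xi_k)}{f'(z_k)}. \tag{10.8.16}$$

Thus, if the iteration converges to $\alpha$, there follows

$$\alpha - z_{k+1} \approx -\frac{1}{2}\,\frac{f''(\alpha)}{f'(\alpha)}\,(\alpha - z_k)^2 \tag{10.8.17}$$

when $k$ is sufficiently large.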

It is important to notice that here the error in $z_{k+1}$ tends to be proportional to the square of the error in $z_k$ as $k \to \infty$, whereas in the other methods so far considered the two successive errors generally tend to be in a constant ratio, if the iteration converges. We say that such an iteration is a second-order process, whereas the preceding methods generally are first-order processes.

If this method is applied to (10.8.6), the recurrence formula (10.8.14) becomes

$$z_{k+1} = z_k - \frac{z_k^3 - z_k - 1}{3z_k^2 - 1} = \frac{2z_k^3 + 1}{3z_k^2 - 1}$$

and, with $z_0 = 1.3$, the results of the first two iterations are $z_1 = 1.325$ and $z_2 = 1.324718$, when rounded to the places given.
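A short Python sketch of this calculation follows. It is illustrative only; the function name `newton_raphson` is an assumption, and the iterates are carried without intermediate rounding, so $z_2$ differs from the hand value in the seventh decimal place.

```python
def newton_raphson(f, fprime, z0, steps):
    """Return [z_0, ..., z_steps] from z_{k+1} = z_k - f(z_k)/f'(z_k)."""
    zs = [z0]
    for _ in range(steps):
        z = zs[-1]
        zs.append(z - f(z) / fprime(z))
    return zs


f = lambda x: x**3 - x - 1.0
fprime = lambda x: 3.0 * x**2 - 1.0

zs = newton_raphson(f, fprime, 1.3, 3)
for k, z in enumerate(zs):
    print(k, z)
```

The number of correct digits roughly doubles at each step, which is the second-order behavior described by (10.8.17).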

Use can be made of (10.8.16) to predict in advance the probable number of correct digits in each iterate. For, since here $f''/f'$ has a value of about 2 when $x = z_0 = 1.3$, it may be expected that the coefficient of $(\alpha - z_k)^2$ in (10.8.17) will have a value approximating $-1$, so that the error $\varepsilon_k$ in the $k$th iterate will be approximately the square of that in the preceding iterate, and will be of negative sign. If, initially, it is known that the true value lies between 1.3 and 1.4, and hence that $z_0$ is in error by less than 0.1, it can be predicted that $z_1$ will be in error by less than about 0.01, so that three places would be retained. With $\varepsilon_0$ reestimated as $z_1 - z_0 \approx 0.03$, there follows $|\varepsilon_1| \approx 10^{-3}$. Hence $\varepsilon_2$ may be expected to be less than about $10^{-6}$, so that six places would be retained at that stage. A comparison of $z_1$ and $z_2$ confirms the earlier prediction (although this method of error estimation may be undependable in early stages, in other cases) and suggests that the error in the next iterate $z_3$ will be in about the twelfth decimal place.

[Fig. 10.4]

If the curve representing $y = f(x)$ possesses turning points or inflections in the interval between the initial estimate $x = z_0$ and the true root $x = \alpha$, or between $z_0$ and $z_1$, the iteration may not converge to $\alpha$, as is illustrated in Fig. 10.4, although it may well converge to some other root. However, if $f'(x)$ and $f''(x)$ do not change sign in the interval $(z_0, \alpha)$, and if $f(z_0)$ and $f''(z_0)$ have the same sign, so that the iteration is initiated at a point at which the curve representing $y = f(x)$ is concave away from the $x$ axis (as, for example, in Fig. 10.3), it is easily seen, by geometrical considerations, that successive iterates must tend to $x = \alpha$ and that they all lie between $z_0$ and $\alpha$. If $f(z_0)$ and $f''(z_0)$ have opposite signs, the first iterate $z_1$ is on the opposite side of $\alpha$ and convergence is uncertain unless $f'(x)$ and $f''(x)$ also do not change sign at $x = \alpha$ or in the interval $(z_1, \alpha)$, in which case convergence then follows as before.

The methods outlined in this section can be combined and modified in various ways. In particular, if $f'(z_k)$ begins to change slowly with $k$ after (say) $r$ iterations, the Newton-Raphson procedure may be modified by taking $\gamma_k = f'(z_r)$ for all $k \geq r$. The method of false position may be modified, for example, by taking $\gamma_k$ as the slope of the secant $P_0P_1$ for all $k$, where $P_0$ and $P_1$ are two fixed points on the curve $y = f(x)$, near to and separated by the point $P$ at which $x = \alpha$, or by taking $\gamma_k$ as the slope of the secant $P_0P_k$, where $P_0$ is an appropriately chosen fixed point on the curve. Whereas such modifications lead to appreciable reductions in labor, their use clearly may also reduce the rate of convergence, or may instigate divergence, if the condition (10.8.13) is violated when $k$ is large.

Also, the methods are readily generalized to the treatment of two or more simultaneous nonlinear equations (algebraic or transcendental). Thus, for example, the two simultaneous equations

$$f(x, y) = 0, \qquad g(x, y) = 0 \tag{10.8.18}$$

can be written (in various ways) in equivalent forms

$$x = F(x, y), \qquad y = G(x, y), \tag{10.8.19}$$

and the method of successive substitutions can be based on the recurrence formulas

$$x_{k+1} = F(x_k, y_k), \qquad y_{k+1} = G(x_k, y_k). \tag{10.8.20}$$

When the iteration converges to the true solution pair, say $x = \alpha$ and $y = \beta$, it can be shown that the errors in the $k$th iterates tend to be described by the relations

$$\alpha - x_k \approx A_1\lambda_1^k + B_1\lambda_2^k, \qquad \beta - y_k \approx A_2\lambda_1^k + B_2\lambda_2^k,$$

where $A_1$, $A_2$, $B_1$, and $B_2$ are constants, independent of $k$, and where $\lambda_1$ and $\lambda_2$ are the roots of the equation

$$\begin{vmatrix} \lambda - F_x & -F_y \\ -G_x & \lambda - G_y \end{vmatrix} = 0 \quad \text{or} \quad \lambda^2 - (F_x + G_y)\lambda + (F_xG_y - F_yG_x) = 0, \tag{10.8.21}$$

with the partial derivatives evaluated at $(\alpha, \beta)$, if $F_xG_y \neq F_yG_x$ at that point. The constants $A_1$, $B_1$, and $A_2$, $B_2$ will be conjugate complex if the same is true of $\lambda_1$, $\lambda_2$. Thus the iteration will be asymptotically stable at $(\alpha, \beta)$ if and only if the roots $\lambda_1$ and $\lambda_2$ are smaller than unity in absolute value, the necessary and sufficient conditions for which are

$$|F_x + G_y| < F_xG_y - F_yG_x + 1 < 2. \tag{10.8.22}$$

A more stringent pair of conditions, which is sufficient (but generally not necessary) for asymptotic stability, is of the form

$$|F_x| + |F_y| < 1, \qquad |G_x| + |G_y| < 1. \tag{10.8.23}$$

As before, these conditions are not sufficient for convergence, in that the iteration may fail to converge even though they are satisfied, unless the iteration is started with $(x_0, y_0)$ sufficiently near $(\alpha, \beta)$.
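The relation between the two stability tests can be explored numerically. The sketch below is an illustration under stated assumptions: the function name `substitution_stability` and the sample partial-derivative values are hypothetical and are not taken from the text. It evaluates (10.8.22), (10.8.23), and the magnitudes of the roots of (10.8.21).

```python
import cmath


def substitution_stability(Fx, Fy, Gx, Gy):
    """Return (exact_test, sufficient_test, root_magnitudes) at a solution point."""
    trace = Fx + Gy
    det = Fx * Gy - Fy * Gx
    exact = abs(trace) < det + 1.0 < 2.0                      # (10.8.22)
    sufficient = (abs(Fx) + abs(Fy) < 1.0 and
                  abs(Gx) + abs(Gy) < 1.0)                    # (10.8.23)
    disc = cmath.sqrt(trace * trace - 4.0 * det)              # roots of (10.8.21)
    magnitudes = (abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))
    return exact, sufficient, magnitudes


# Hypothetical values of F_x, F_y, G_x, G_y at the solution point.
case_a = substitution_stability(0.5, 0.2, 0.1, 0.3)   # both tests satisfied
case_b = substitution_stability(0.0, 2.0, 0.1, 0.0)   # stable, but (10.8.23) fails
case_c = substitution_stability(1.5, 0.0, 0.0, 0.1)   # unstable
print(case_a, case_b, case_c)
```

The second case shows why (10.8.23) is described as sufficient but not necessary: both roots are below unity in magnitude although one row sum exceeds one.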

The Newton-Raphson iteration, as applied to the solution of (10.8.18), is based on the result of replacing $(\alpha, \beta)$ by $(x_{k+1}, y_{k+1})$ in the right-hand members of the Taylor expansions

$$0 = f(\alpha, \beta) = f(x_k, y_k) + (\alpha - x_k)f_x(x_k, y_k) + (\beta - y_k)f_y(x_k, y_k) + \cdots,$$
$$0 = g(\alpha, \beta) = g(x_k, y_k) + (\alpha - x_k)g_x(x_k, y_k) + (\beta - y_k)g_y(x_k, y_k) + \cdots \tag{10.8.24}$$

and neglecting nonlinear terms in $x_{k+1} - x_k$ and $y_{k+1} - y_k$, so that the recurrence formulas are of the form

$$(x_{k+1} - x_k)f_x(x_k, y_k) + (y_{k+1} - y_k)f_y(x_k, y_k) = -f(x_k, y_k),$$
$$(x_{k+1} - x_k)g_x(x_k, y_k) + (y_{k+1} - y_k)g_y(x_k, y_k) = -g(x_k, y_k). \tag{10.8.25}$$

Rather than resolve these equations for $x_{k+1}$ and $y_{k+1}$, it is usually convenient to solve them, as written, for the corrections $\Delta x_k = x_{k+1} - x_k$ and $\Delta y_k = y_{k+1} - y_k$, which are to be added to $x_k$ and $y_k$ to yield the following iterates. When the iteration converges, the errors in the $(k + 1)$th iterates generally tend to become linear combinations of the squares of the errors in the $k$th iterates (that is, the iteration is a second-order process), whereas, in the method of successive substitutions based on (10.8.20), the new errors generally tend to become linear combinations of the preceding errors themselves.
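A minimal Python sketch of the correction form of (10.8.25) is given below. The test system $x^2 + y^2 = 4$, $xy = 1$ and all function names are hypothetical choices made for illustration; they do not appear in the text. The two linear equations for the corrections are solved by Cramer's rule.

```python
import math


def newton_two_equations(f, g, fx, fy, gx, gy, x, y, steps):
    """Iterate (10.8.25), solving for the corrections dx, dy at each stage."""
    for _ in range(steps):
        a, b = fx(x, y), fy(x, y)
        c, d = gx(x, y), gy(x, y)
        jac = a * d - b * c                  # determinant of the 2 x 2 system
        if jac == 0.0:
            raise ZeroDivisionError("corrections are not uniquely determined")
        rf, rg = -f(x, y), -g(x, y)          # right-hand members
        dx = (rf * d - b * rg) / jac
        dy = (a * rg - rf * c) / jac
        x, y = x + dx, y + dy
    return x, y


# Hypothetical example: x^2 + y^2 = 4, x*y = 1, started near (2, 0.5).
x_sol, y_sol = newton_two_equations(
    lambda x, y: x * x + y * y - 4.0,
    lambda x, y: x * y - 1.0,
    lambda x, y: 2.0 * x, lambda x, y: 2.0 * y,
    lambda x, y: y, lambda x, y: x,
    2.0, 0.5, 6)
print(x_sol, y_sol)
```

For this particular system the exact solution near the starting point is $x = (\sqrt{6} + \sqrt{2})/2$, $y = (\sqrt{6} - \sqrt{2})/2$, which gives a convenient check on the iterates.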

When the so-called Jacobian determinant of $f$ and $g$,

$$J = \begin{vmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} \\[2mm] \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \end{vmatrix}, \tag{10.8.26}$$

vanishes at the point $(x_k, y_k)$, the equations (10.8.25) do not possess a unique solution. More generally, if $J$ vanishes at or near the point $(\alpha, \beta)$, slow convergence or divergence of the iteration may be anticipated and modified procedures are generally needed (see Milne [10]).

10.9. Iterated Synthetic Division. When $f(x)$ is a polynomial, of degree $n$, so that the equation to be solved is an algebraic one, methods such as those of the preceding section can be systematized by the use of synthetic division.

For this purpose, suppose that

$$f(x) = x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n, \tag{10.9.1}$$

and, first, let $f(x)$ be divided by the linear expression $x - z$, so that

$$x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = (x - z)(x^{n-1} + b_1x^{n-2} + \cdots + b_{n-2}x + b_{n-1}) + R, \tag{10.9.2}$$

where $x^{n-1} + \cdots + b_{n-1}$ represents the quotient, and $R$ is the constant remainder. Here the coefficients $b_1, \ldots, b_{n-1}$ and the remainder $R$ depend upon $z$. By setting $x = z$ in (10.9.2), it follows, in particular, that

$$R = f(z). \tag{10.9.3}$$

If now the quotient in (10.9.2) is again divided by $x - z$, so that

$$x^{n-1} + b_1x^{n-2} + \cdots + b_{n-2}x + b_{n-1} = (x - z)(x^{n-2} + c_1x^{n-3} + \cdots + c_{n-3}x + c_{n-2}) + R', \tag{10.9.4}$$

and hence

$$f(x) = (x - z)^2(x^{n-2} + c_1x^{n-3} + \cdots + c_{n-2}) + (x - z)R' + R,$$

there follows also

$$R' = f'(z), \tag{10.9.5}$$

and, indeed, if the process is repeated $k$ times, it is easily seen that the remainder $R^{(k)}$ is then $f^{(k)}(z)/k!$.

The method of synthetic division is based on the fact that, by equating coefficients of $x^{n-1}, x^{n-2}, \ldots, x$, and 1 in the two members of (10.9.2), we obtain the relations

$$a_1 = b_1 - z, \quad a_2 = b_2 - zb_1, \quad \ldots, \quad a_{n-1} = b_{n-1} - zb_{n-2}, \quad a_n = R - zb_{n-1}. \tag{10.9.6}$$

Thus, if we introduce the recurrence formula

$$b_k = a_k + zb_{k-1} \quad (k = 1, 2, \ldots, n), \qquad \text{with } b_0 = 1, \tag{10.9.7}$$

it follows that this formula will generate the coefficients of the quotient of (10.9.2) with $k = 1, 2, \ldots, n - 1$, and also that

$$R = f(z) = b_n = a_n + zb_{n-1}. \tag{10.9.8}$$

Further, the $c$'s in (10.9.4) are related, for $k = 1, 2, \ldots, n - 1$, to the $b$'s as the $b$'s are related to the $a$'s, and there follows also

$$R' = f'(z) = c_{n-1} = b_{n-1} + zc_{n-2}. \tag{10.9.9}$$

For actual calculation, it is convenient to arrange the entries in parallel columns (or rows), in the form

1             1             1
$a_1$         $b_1$         $c_1$
$a_2$         $b_2$         $c_2$
...           ...           ...
$a_{n-2}$     $b_{n-2}$     $c_{n-2}$
$a_{n-1}$     $b_{n-1}$     $R'$
$a_n$         $R$

so that each element is obtained by adding to its left-hand neighbor $z$ times its upward neighbor.

Thus, if the roots of the algebraic equation $f(x) = 0$ are $x = \alpha_1, \alpha_2, \ldots, \alpha_n$, and if the Newton-Raphson procedure were to be used to approximate $\alpha_1$, starting with an initial approximation $z$, the next approximation, say $z^*$, would be given simply by

$$z^* = z - \frac{R}{R'}, \tag{10.9.10}$$

and the process would then be repeated with $z$ replaced by $z^*$. This method of tabulation avoids much of the labor involved in evaluating the polynomials $f(z)$ and $f'(z)$ by calculating powers of $z$ and forming linear combinations of them.†

In the simple case of the cubic equation (10.8.6), for which

$$f(x) = x^3 - x - 1,$$

the first two iterations (starting with $z = 1.3$) would be tabulated as follows:

$z =$       1.3                      1.325
  1         1          1             1            1
  0         1.3        2.6           1.325        2.65
 -1         0.69       4.07          0.755625     4.267
 -1        -0.103                    0.001203
$\Delta z =$  0.025                  -0.000282

The approximation obtained at this stage is thus 1.324718, in accordance with the results obtained in the preceding section.

Once the iteration is terminated, so that one zero of $f(x)$ is approximated and the last entry in the $b$ column is effectively reduced to zero, the remaining entries in the $b$ column are (approximately) the coefficients of the reduced polynomial, of degree $n - 1$, whose zeros are the remaining zeros of $f(x)$.

† This procedure for applying the Newton-Raphson iteration to the solution of algebraic equations is sometimes known as the Birge-Vieta method.
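The tabular scheme of this section translates directly into a few lines of Python. The sketch below is illustrative; the names `synthetic_division` and `birge_vieta_step` are assumptions, and the coefficient list is taken as $[1, a_1, \ldots, a_n]$. It builds the $b$ and $c$ columns by the recurrence $b_k = a_k + zb_{k-1}$ and forms $z^* = z - R/R'$.

```python
def synthetic_division(coeffs, z):
    """Given [1, a_1, ..., a_n], return [b_0, b_1, ..., b_n]; b_n is the remainder."""
    out = [coeffs[0]]
    for a in coeffs[1:]:
        out.append(a + z * out[-1])      # b_k = a_k + z * b_{k-1}
    return out


def birge_vieta_step(coeffs, z):
    """One Newton-Raphson step using two synthetic divisions."""
    b = synthetic_division(coeffs, z)        # b[-1] = R  = f(z)
    c = synthetic_division(b[:-1], z)        # c[-1] = R' = f'(z)
    return z - b[-1] / c[-1], b, c


cubic = [1.0, 0.0, -1.0, -1.0]               # x^3 - x - 1
z1, b, c = birge_vieta_step(cubic, 1.3)
z2, _, _ = birge_vieta_step(cubic, z1)
print(b, c, z1, z2)
```

For $z = 1.3$ the lists `b` and `c` reproduce the first pair of columns in the tabulation above (up to floating-point rounding), with remainders $-0.103$ and $4.07$.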
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A simpler procedure, due to S. N. Lin (see [135] and [136]), is based on 
the fact that, in virtue of (10.9.8), the condition $f(z) = 0$ is equivalent to the condition $a_n + zb_{n-1} = 0$. That is, if and only if the assumed value of $z$ were a root of $f(x) = 0$, then the corresponding value of $b_{n-1}$ (which depends upon $z$) would be such that $z = -a_n/b_{n-1}(z)$. Lin's iteration is the result of applying the method of successive substitutions to the equation written in this form, so that the revised estimate $z^*$ is defined by the formula

$$z^* = -\frac{a_n}{b_{n-1}}, \tag{10.9.11}$$

and hence

$$z^* - z = -\frac{a_n + zb_{n-1}}{b_{n-1}}$$

or, equivalently, in virtue of (10.9.8),

$$z^* = z - \frac{R}{b_{n-1}}. \tag{10.9.12}$$

In this method, the formation of the $c$ column is avoided, so that the labor per iteration is reduced by nearly one-half. However, if the method is applied to the example treated above, the first three iterations may be obtained as follows:

$z =$       1.3        1.45       0.91       -5.8
  1         1          1          1
  0         1.3        1.45       0.91
 -1         0.69       1.102     -0.172
 -1        -0.103      0.598     -1.157
$\Delta z =$  0.15      -0.54      -6.7

Clearly, the iteration is not convergent in this case.
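The divergence seen in this tabulation is easy to reproduce. The following Python sketch is illustrative only (the names `synthetic_division` and `lin_iterates` are assumptions); it applies $z^* = z - R/b_{n-1}$ repeatedly to $x^3 - x - 1$ from $z = 1.3$, without the intermediate rounding used in the hand calculation.

```python
def synthetic_division(coeffs, z):
    """Given [1, a_1, ..., a_n], return [b_0, ..., b_n] with b_n = R = f(z)."""
    out = [coeffs[0]]
    for a in coeffs[1:]:
        out.append(a + z * out[-1])
    return out


def lin_iterates(coeffs, z0, steps):
    """Return [z_0, z_1, ...] from Lin's recurrence z* = z - R / b_{n-1}."""
    zs = [z0]
    for _ in range(steps):
        b = synthetic_division(coeffs, zs[-1])
        zs.append(zs[-1] - b[-1] / b[-2])
    return zs


zs = lin_iterates([1.0, 0.0, -1.0, -1.0], 1.3, 3)
print([round(z, 3) for z in zs])
```

The successive estimates move away from the root near 1.3247, with the error growing and changing sign at each step.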
In order to investigate the Lin procedure more closely, we may note that, since (10.9.8) gives

$$b_{n-1}(z) = \frac{f(z) - a_n}{z},$$

the recurrence relation written in the form (10.9.11) can also be put in the form

$$z^* = -\frac{a_n z}{f(z) - a_n}.$$

Thus Lin's method is equivalent to applying the method of successive substitutions to the result of writing $f(x) = 0$ in the form

$$x = -\frac{a_n x}{f(x) - a_n} \equiv F(x). \tag{10.9.13}$$

In the example just considered, (10.9.13) becomes $x = 1/(x^2 - 1)$, which, as was seen in §10.8, is not suitable for successive substitutions since the convergence factor $F'(x)$ has a value of about $-5$ near the real root, whereas, for convergence, its absolute value should be smaller than unity. In confirmation, we may notice that the error in $z^* = 1.45$ is indeed about five times the error in $z = 1.3$ and is of opposite sign.

More generally, we find from (10.9.13) that

$$F'(x) = a_n\,\frac{xf'(x) - f(x) + a_n}{[f(x) - a_n]^2}$$

and hence, at a zero $\alpha_r$ of $f(x)$, Lin's method possesses the asymptotic convergence factor

$$\rho = F'(\alpha_r) = 1 + \frac{\alpha_r}{a_n}f'(\alpha_r) = 1 + \alpha_r\,\frac{f'(\alpha_r)}{f(0)}. \tag{10.9.14}$$

Thus the result of applying Lin's iteration to a good approximation to $\alpha_r$ will lead to a poorer one unless $|\rho| < 1$. That is, a necessary condition for convergence to $\alpha_r$ is that

$$\left|1 + \frac{\alpha_r}{a_n}f'(\alpha_r)\right| < 1. \tag{10.9.15}$$

This criterion is a useful one if a rough approximation to $\alpha_r$ is known initially, unless $f'(x)$ varies rapidly near $x = \alpha_r$. If we recall that $a_n = (-1)^n\alpha_1\alpha_2 \cdots \alpha_n$ and that $f'(\alpha_r) = (\alpha_r - \alpha_1) \cdots (\alpha_r - \alpha_n)$, where the factor $(\alpha_r - \alpha_r)$ is to be omitted, we may deduce that (10.9.15) can also be expressed in the form

$$\left|1 - \left(1 - \frac{\alpha_r}{\alpha_1}\right)\left(1 - \frac{\alpha_r}{\alpha_2}\right) \cdots \left(1 - \frac{\alpha_r}{\alpha_n}\right)\right| < 1, \tag{10.9.16}$$

with the factor corresponding to $\alpha_r$ itself again omitted, in terms of the remaining roots of $f(x) = 0$.

It is of interest to notice that, since

$$b_{n-1} = \frac{R - a_n}{z} = \frac{f(z) - f(0)}{z},$$

it follows that $b_{n-1}$ is the slope of the secant joining the ordinate at $x = 0$ and the ordinate at $x = z$. Thus, (10.9.12) is equivalent to the result of taking $\gamma_k$ as the slope of that secant in the more general recurrence relation (10.8.9), and the Lin iteration therefore amounts to determining $z^*$ by linear interpolation (or extrapolation) based on the fixed ordinate $f(0)$ and the most recently calculated ordinate $f(z)$ (see Fig. 10.5). Also the requirement (10.9.15) is easily interpreted as demanding that the ratio of the slope of the curve at $P$ to the slope of the secant $P_0P$ be positive and less than two.




























































































[Fig. 10.5]


In the case of the equation

$$x^4 - 8x^3 + 23x^2 + 16x - 50 = 0, \tag{10.9.17}$$

a real root is easily seen to lie between $x = 1$ and $x = 2$. If Lin's iteration is used, starting with $z = 1.5$, the results of the first three iterations are as follows:

$z =$       1.5         1.39        1.421       1.4125
   1        1           1           1
  -8       -6.5        -6.61       -6.579
  23       13.25       13.8121     13.6512
  16       35.875      35.1988     35.3984
 -50        3.8125     -1.0737      0.3011
$\Delta z =$ -0.11        0.031      -0.0085


The true roots of (10.9.17) are $\pm\sqrt{2}$ and $4 \pm 3i$. The rate of convergence of the Lin iteration, in this case, might have been predicted in advance by approximating $\alpha_r$ by 1.5 in (10.9.14) to obtain $\rho \approx -0.3$. The asymptotic convergence factor is $-0.25$, to two places.†
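The factor (10.9.14) is simple to evaluate by machine. The sketch below is an illustration (the names `horner` and `lin_convergence_factor` are assumptions); it evaluates $\rho = 1 + \alpha f'(\alpha)/a_n$ for the quartic at the rough estimate 1.5 and at the root $\sqrt{2}$, and for the cubic (10.8.6) near its real root.

```python
import math


def horner(coeffs, x):
    """Return (f(x), f'(x)) for coefficients listed from highest power down."""
    p, dp = 0.0, 0.0
    for a in coeffs:
        dp = dp * x + p
        p = p * x + a
    return p, dp


def lin_convergence_factor(coeffs, alpha):
    """Asymptotic factor (10.9.14): 1 + alpha * f'(alpha) / a_n."""
    _, dfa = horner(coeffs, alpha)
    return 1.0 + alpha * dfa / coeffs[-1]


quartic = [1.0, -8.0, 23.0, 16.0, -50.0]
cubic = [1.0, 0.0, -1.0, -1.0]

rho_guess = lin_convergence_factor(quartic, 1.5)
rho_root = lin_convergence_factor(quartic, math.sqrt(2.0))
rho_cubic = lin_convergence_factor(cubic, 1.3247179572)
print(rho_guess, rho_root, rho_cubic)
```

The quartic values lie inside the unit interval, consistent with the alternating, shrinking corrections in the table, whereas the cubic value is well outside it, consistent with the divergence observed earlier.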

† When the ratio of successive values of $\Delta z_k$ becomes nearly constant, that ratio serves as an estimate of $\rho$, and a generally improved value of $\Delta z_k$ then is $(\Delta z_k)(1 + \rho + \rho^2 + \cdots) = (\Delta z_k)/(1 - \rho)$.

When the Lin iteration is unstable in the neighborhood of a zero $\alpha_r$ of a polynomial $f(x)$, stability can be attained by translating the origin to a new point $x = c$, if that point is sufficiently near to $\alpha_r$. For, if we make the change of variables $x = t + c$, and apply the Lin iteration to the new polynomial

$$f(x) \equiv f(t + c) = t^n + a_1't^{n-1} + \cdots + a_{n-1}'t + a_n',$$

for which $a_n' = f(c)$, reference to (10.9.13) shows that this process is equivalent to the application of the method of successive substitutions to the result of writing $f(t + c) = 0$ in the form

$$t = -\frac{f(c)\,t}{f(t + c) - f(c)} \equiv F(t).$$

Thus, at a zero $t = \alpha_r - c$ of $f(t + c)$, we may obtain the expression

$$\rho = F'(\alpha_r - c) = 1 - (c - \alpha_r)\,\frac{f'(\alpha_r)}{f(c)}$$

for the asymptotic convergence factor. Since $f(\alpha_r) = 0$, we have also $f(c)/(c - \alpha_r) = [f(c) - f(\alpha_r)]/(c - \alpha_r) = f'(\xi_r)$, where $\xi_r$ lies between $\alpha_r$ and $c$. Thus there follows

$$\rho = 1 - \frac{f'(\alpha_r)}{f'(\xi_r)} \qquad (\xi_r \text{ between } \alpha_r \text{ and } c),$$

and this factor is near zero when $c$ is sufficiently near $\alpha_r$.

Hence, if $c$ is a fair approximation to $\alpha_r$, and if $f'(x)/f'(\alpha_r)$ is not small between $x = c$ and $x = \alpha_r$, asymptotic stability generally will be introduced by translating the origin to $x = c$. Clearly, this process amounts to using $f(c)$ as the fixed ordinate, in place of $f(0)$, in Fig. 10.5.

In those cases when the coefficients $a_1, \ldots, a_n$ of $f(x)$ are inexact, it is desirable to have upper bounds on the corresponding inherent possible errors relevant to the roots of $f(x) = 0$. If $\alpha_r$ is obtained as a root of the equation

$$f(\alpha_r) \equiv \alpha_r^n + a_1\alpha_r^{n-1} + \cdots + a_{n-1}\alpha_r + a_n = 0,$$

whereas the true coefficients are $a_1 + \delta a_1, \ldots, a_n + \delta a_n$, then the corresponding true root $\alpha_r + \delta\alpha_r$ must satisfy the equation

$$(\alpha_r + \delta\alpha_r)^n + (a_1 + \delta a_1)(\alpha_r + \delta\alpha_r)^{n-1} + \cdots + (a_{n-1} + \delta a_{n-1})(\alpha_r + \delta\alpha_r) + (a_n + \delta a_n) = 0.$$

If the first equation is subtracted from the second, and if it is assumed that the relative errors are sufficiently small to permit neglect of second-order terms, it follows that, to a first approximation, $\delta\alpha_r$ must satisfy the equation

$$[n\alpha_r^{n-1} + (n - 1)a_1\alpha_r^{n-2} + \cdots + a_{n-1}]\,\delta\alpha_r + \alpha_r^{n-1}\,\delta a_1 + \alpha_r^{n-2}\,\delta a_2 + \cdots + \delta a_n = 0,$$

and hence

$$\delta\alpha_r \approx -\frac{\alpha_r^{n-1}\,\delta a_1 + \alpha_r^{n-2}\,\delta a_2 + \cdots + \delta a_n}{f'(\alpha_r)}. \tag{10.9.18}$$

In particular, if each coefficient is known to be in error by less than $\varepsilon$,


$$|\delta a_k| < \varepsilon \qquad (k = 1, 2, \ldots, n), \tag{10.9.19}$$

there follows, within the same degree of approximation,

$$|\delta\alpha_r|_{\max} \approx \frac{1 + |\alpha_r| + |\alpha_r|^2 + \cdots + |\alpha_r|^{n-1}}{|f'(\alpha_r)|}\,\varepsilon$$

or

$$|\delta\alpha_r|_{\max} \approx \frac{|\alpha_r|^n - 1}{(|\alpha_r| - 1)\,|f'(\alpha_r)|}\,\varepsilon. \tag{10.9.20}$$

Thus, in the case of the real root of (10.8.6), it is found that errors of magnitude $\varepsilon$ in the coefficients would correspond to a maximum error of very nearly the same magnitude in the root, if the errors are small. For the root $x = +\sqrt{2}$ of (10.9.17), the maximum error in the root is found to be about one-sixth of the maximum error in the coefficients.
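Both numerical statements can be checked with a few lines of Python. The sketch below is illustrative (the names `horner` and `root_error_factor` are assumptions); it evaluates the multiplier of $\varepsilon$ in (10.9.20), namely $(1 + |\alpha| + \cdots + |\alpha|^{n-1})/|f'(\alpha)|$, for the two roots mentioned.

```python
import math


def horner(coeffs, x):
    """Return (f(x), f'(x)) for coefficients listed from highest power down."""
    p, dp = 0.0, 0.0
    for a in coeffs:
        dp = dp * x + p
        p = p * x + a
    return p, dp


def root_error_factor(coeffs, alpha):
    """Multiplier of epsilon in the first-order bound (10.9.20)."""
    n = len(coeffs) - 1
    _, dfa = horner(coeffs, alpha)
    return sum(abs(alpha) ** j for j in range(n)) / abs(dfa)


cubic_factor = root_error_factor([1.0, 0.0, -1.0, -1.0], 1.3247179572)
quartic_factor = root_error_factor([1.0, -8.0, 23.0, 16.0, -50.0], math.sqrt(2.0))
print(cubic_factor, quartic_factor)
```

The first multiplier is close to unity and the second is close to one-sixth, in agreement with the two statements above.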
The preceding methods are valid, in principle, for the determination of complex roots as well as real ones. However, since a real initial approximation leads necessarily to real iterates, when the coefficients are real, the process then must be initiated with a complex initial estimate, and calculations with complex numbers are involved in each step of the process. When the coefficients are real, the complex roots occur in conjugate pairs, and it is generally preferable to exploit this fact by seeking quadratic real factors, rather than linear complex ones, by a generalized method of synthetic division.

Before treating such processes, we present two additional general methods which each possess certain favorable properties.
10.10. Bernoulli's Iteration. A method, originally due to Daniel Bernoulli, for obtaining roots of the algebraic equation

$$x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = 0, \tag{10.10.1}$$

is based on the related recurrence formula

$$\mu_k + a_1\mu_{k-1} + \cdots + a_{n-1}\mu_{k-n+1} + a_n\mu_{k-n} = 0, \tag{10.10.2}$$

having the same coefficients as (10.10.1).

If the roots of (10.10.1) are $\alpha_1, \alpha_2, \ldots, \alpha_n$, and if (10.10.2) is considered as a difference equation, its general solution is found to be†

$$\mu_k = C_1\alpha_1^k + C_2\alpha_2^k + C_3\alpha_3^k + \cdots + C_n\alpha_n^k, \tag{10.10.3}$$

where the $n$ $C$'s are constants, independent of $k$, which are determined by the values of $\mu_0, \mu_1, \ldots$, and $\mu_{n-1}$, if no roots are repeated. Under this assumption, let the roots be numbered in decreasing order of magnitude, so that $\alpha_1$ here denotes the largest root of (10.10.1). Then since (10.10.3) can be written in the form

$$\mu_k = C_1\alpha_1^k\left[1 + \frac{C_2}{C_1}\left(\frac{\alpha_2}{\alpha_1}\right)^k + \frac{C_3}{C_1}\left(\frac{\alpha_3}{\alpha_1}\right)^k + \cdots + \frac{C_n}{C_1}\left(\frac{\alpha_n}{\alpha_1}\right)^k\right], \tag{10.10.4}$$

if $C_1 \neq 0$, it follows that, in any sequence generated by (10.10.2), the $k$th term is approximated by $C_1\alpha_1^k$ as $k \to \infty$ and, indeed, that the ratio

$$r_k = \frac{\mu_k}{\mu_{k-1}} \tag{10.10.5}$$

tends to $\alpha_1$ as $k \to \infty$, if the largest root $\alpha_1$ is real and unrepeated and if no other root has equal magnitude, unless $\mu_0, \mu_1, \ldots, \mu_{n-1}$ are so chosen that the coefficient $C_1$ of $\alpha_1^k$ in (10.10.3) is zero.

† If a solution of (10.10.2) is assumed in the form $\mu_k = \alpha^k$, it is found that the characteristic equation determining admissible values of $\alpha$ is identical with (10.10.1). Thus $\alpha_1^k, \alpha_2^k, \ldots, \alpha_n^k$ are all solutions, and superposition leads to (10.10.3), which can be shown to represent the most general solution, if no roots are repeated, when only integral values of $k$ are considered.

If the largest root $\alpha_1$ is complex, and the coefficients of (10.10.1) are real, then $\alpha_2$ is the complex conjugate of $\alpha_1$ and is of equal magnitude. If we write

$$\alpha_1 = \xi_1 + i\eta_1 = \beta_1e^{i\phi_1}, \qquad \alpha_2 = \xi_1 - i\eta_1 = \beta_1e^{-i\phi_1}, \tag{10.10.6}$$

where $\beta_1 > 0$ and $\xi_1$, $\eta_1$, $\beta_1$, and $\phi_1$ are real, the terms corresponding to $\alpha_1$ and $\alpha_2$ in (10.10.3) can be expressed in the real form

$$\beta_1^k(C_1\cos k\phi_1 + C_2\sin k\phi_1),$$

if $C_1$ and $C_2$ are replaced by $(C_1 - iC_2)/2$ and $(C_1 + iC_2)/2$, respectively, in (10.10.3).

Thus, if $\alpha_1$ and $\alpha_2$ are not repeated and if all other roots are smaller in magnitude than $\beta_1$, it follows that

$$\mu_k \approx \beta_1^k(C_1\cos k\phi_1 + C_2\sin k\phi_1) \qquad (k \to \infty). \tag{10.10.7}$$

But, if $\mu_k$ were given exactly by the right-hand member of (10.10.7), it would satisfy the recurrence relation

$$\mu_{k+1} - 2\mu_k\beta_1\cos\phi_1 + \beta_1^2\mu_{k-1} = 0, \tag{10.10.8}$$

and conversely, as is easily verified. A second relation, involving the two real unknown quantities $\beta_1$ and $\phi_1$, would then be obtained, by replacing $k$ by $k - 1$, in the form

$$\mu_k - 2\mu_{k-1}\beta_1\cos\phi_1 + \beta_1^2\mu_{k-2} = 0. \tag{10.10.9}$$

The result of eliminating $\cos\phi_1$ from these two relations is

$$(\mu_{k-1}^2 - \mu_k\mu_{k-2})\,\beta_1^2 = \mu_k^2 - \mu_{k+1}\mu_{k-1}, \tag{10.10.10}$$

whereas the result of eliminating $\beta_1^2$ is

$$2(\mu_{k-1}^2 - \mu_k\mu_{k-2})\,\beta_1\cos\phi_1 = \mu_k\mu_{k-1} - \mu_{k+1}\mu_{k-2}. \tag{10.10.11}$$

Thus, if we introduce the definitions

$$s_k = \mu_k^2 - \mu_{k+1}\mu_{k-1}, \qquad t_k = \mu_k\mu_{k-1} - \mu_{k+1}\mu_{k-2}, \tag{10.10.12}$$

these relations become

$$\beta_1^2 = \xi_1^2 + \eta_1^2 = \frac{s_k}{s_{k-1}}, \qquad 2\beta_1\cos\phi_1 = 2\xi_1 = \frac{t_k}{s_{k-1}}. \tag{10.10.13}$$

It follows that, unless it happens that $C_1 = C_2 = 0$ in (10.10.7) because of a very special choice of $\mu_0, \mu_1, \ldots, \mu_{n-1}$, the ratios $s_k/s_{k-1}$ and $t_k/s_{k-1}$ will tend to $\beta_1^2$ and $2\beta_1\cos\phi_1$ as $k \to \infty$, from which the constants $\beta_1$ and $\phi_1$, or $\xi_1$ and $\eta_1$, specifying the desired dominant complex root pair in (10.10.6), can be calculated.
If $\alpha_1$ is a repeated real root, of multiplicity two, so that $\alpha_2 = \alpha_1$, and all other roots are of smaller magnitude, then the combination of terms corresponding to $\alpha_1$ and $\alpha_2$ in (10.10.3) is of the form $\alpha_1^k(C_1 + C_2k)$. Since this expression is easily seen to satisfy the relation

$$\mu_{k+1} - 2\mu_k\alpha_1 + \mu_{k-1}\alpha_1^2 = 0, \tag{10.10.14}$$

the sequence $\mu_k$ tends to satisfy the same relation as $k \to \infty$. Whereas an approximation to $\alpha_1$, which tends to $\alpha_1$ as $k \to \infty$, could be obtained as the appropriate one of the two roots of this equation, the solution of a quadratic equation can be avoided by rewriting (10.10.14) with $k$ replaced by $k - 1$, and eliminating $\alpha_1^2$ from the two relations, to give

$$\alpha_1 \approx \frac{t_k}{2s_{k-1}}, \tag{10.10.15}$$

with the notation of (10.10.12).

Other exceptional cases, in which several roots have the same maximum absolute value, can be treated in a similar way.

When the largest root $\alpha_1$ is real and unrepeated, and there are no other roots with the same absolute value, the ratio $r_k$ tends to $\alpha_1$, the rapidity of the convergence depending upon the ratio of the magnitudes of the two largest roots. If $\alpha_1$ and $\alpha_2$ are conjugate complex, it follows from (10.10.7) that $r_k$ will tend to oscillate about the value zero (although a period of the oscillation may comprise several iterations), whereas, for $\alpha_2 = \alpha_1$, the convergence of the ratio $r_k$ to $\alpha_1$ will be slow, while the ratio $t_k/s_{k-1}$ converges more rapidly to $2\alpha_1$. In this manner, after several iterations, the behavior of the sequence of $r$'s generally will indicate the true situation, and recourse can be had to the appropriate one of (10.10.13) and (10.10.15) when that sequence is not convergent. The more complicated situations seldom occur in practice.
If $\alpha_1$ is real and unrepeated, the ideal situation would be that in which $\mu_0, \mu_1, \ldots, \mu_{n-1}$ were so chosen that $C_2 = \cdots = C_n = 0$ in (10.10.3), so that $\mu_0, \mu_1, \ldots, \mu_{n-1}$ would be respectively proportional to $1, \alpha_1, \ldots, \alpha_1^{n-1}$. The first calculated value of $r$, $r_n = \mu_n/\mu_{n-1}$, then clearly would be identical with $\alpha_1$. In such cases, the starting values could be taken efficiently as successive powers of a previously determined approximation to $\alpha_1$. If no information is easily available with regard to the nature of the largest root or roots, the starting values

$$\mu_0 = \mu_1 = \cdots = \mu_{n-2} = 0, \qquad \mu_{n-1} = 1$$

are often convenient. For this set of values it is easily seen that the undesirable case $C_1 = 0$ cannot occur.

A particularly notable set of $n$ starting values, having the same property, is that determined by use of the formula

$$\mu_r = -(a_1\mu_{r-1} + a_2\mu_{r-2} + \cdots + a_{r-1}\mu_1 + ra_r) \qquad (r = 1, 2, \ldots, n), \tag{10.10.16}$$

with $\mu_0 = \mu_{-1} = \cdots = 0$. For this set of starting values it can be shown† that all the $C$'s in (10.10.3) are unity, and hence that $\mu_k$ is then identified with the sum $\alpha_1^k + \alpha_2^k + \cdots + \alpha_n^k$ for all $k \geq 1$. Thus, in particular, if $|\alpha_1| \gg |\alpha_2|, \ldots, |\alpha_n|$, there then follow both $\alpha_1 \approx \mu_k/\mu_{k-1}$ and $\alpha_1 \approx \mu_k^{1/k}$ when $k$ is sufficiently large. With the convention that $a_r = 0$ when $r > n$, it is seen that the recurrence formula (10.10.16) reduces to (10.10.2) when $r > n$. This special procedure is closely related to the Graeffe procedure described in the following section.

In the case of the example (10.8.6), the Bernoulli recurrence relation 
is merely yx = we—-2 + mes. If the iteration is begun with the starting 
values 1.30, 1.69, and 2.20, about 16 iterations are needed to establish the 
real root 1.3247 . . . to five significant figures, although each iteration 
requires very little labor. The remaining roots are complex, with an 
absolute value of about 0.9, so that the ratio of the magnitude of the dom- 
inant root to that of the subdominant root pair is about 1.5. The rela- 
tive slowness of the convergence is due to the relative nearness of this 
ratio to unity. The fact that the subdominant roots are complex causes 
the sequence of iterates to tend to its limit in an oscillatory manner. 
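The iteration just described is short enough to sketch directly. The block below assumes only the recurrence $\mu_k = \mu_{k-2} + \mu_{k-3}$ and the starting values quoted in the text; the function name `bernoulli_ratios` is illustrative.

```python
def bernoulli_ratios(start, steps):
    """Ratios r_k = mu_k / mu_{k-1} for the recurrence mu_k = mu_{k-2} + mu_{k-3}."""
    mu = list(start)
    ratios = []
    for _ in range(steps):
        mu.append(mu[-2] + mu[-3])     # new mu_k from mu_{k-2} and mu_{k-3}
        ratios.append(mu[-1] / mu[-2])
    return ratios


ratios = bernoulli_ratios([1.30, 1.69, 2.20], 60)
for k, r in enumerate(ratios[:16], start=1):
    print(k, round(r, 5))              # oscillatory approach to the real root
```

Printing the early ratios shows the oscillation about the limit that is attributed above to the complex subdominant pair.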

† The proof depends upon certain properties of symmetric functions [see equation
(8.14.28)].

In the case of the example (10.9.17), the recurrence relation is

$$\mu_k = 8\mu_{k-1} - 23\mu_{k-2} - 16\mu_{k-3} + 50\mu_{k-4}$$

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and, with the arbitrarily chosen starting values 0, 0, 0, 1, the ensuing 
calculation is as follows: 





    $\mu_k$      $r_k$       $s_k$       $t_k$    $s_k/s_{k-1}$   $t_k/s_{k-1}$
        8        8          23          8        -             -
       41     5.12         657        200     28.565         8.696
      128     3.12       16261       5224     24.750         7.951
        3     0.02      406537     130600     25.001         8.031
    -3176    -1059    10163401    3251272     25.000         7.997
   -25475     8.02


From the irregular behavior of the $r$ sequence, it may be deduced that
either the process has not yet begun to converge satisfactorily or there
is a pair of dominant complex roots. To test the second hypothesis, the
$s$ and $t$ sequences are constructed, and the convergence of the sequence
of ratios in the last two columns is evident. The true dominant roots are
$\xi_1 \pm i\eta_1 = 4 \pm 3i$, so that $\beta_1^2 = \xi_1^2 + \eta_1^2 = 25$ and $2\xi_1 = 8$. The
approximations afforded by the four successive pairs of ratios are
$4.348 \pm 3.108i$, $3.976 \pm 2.990i$, $4.016 \pm 2.979i$, and $3.998 \pm 3.002i$.
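The tabulated calculation can be reproduced as follows. This is a sketch under stated assumptions: the definitions $s_k = \mu_k^2 - \mu_{k-1}\mu_{k+1}$ and $t_k = \mu_k\mu_{k-1} - \mu_{k+1}\mu_{k-2}$ are inferred from the tabulated values (they agree with every entry), and the equation is taken to be $x^4 - 8x^3 + 23x^2 + 16x - 50 = 0$, which is what the recurrence coefficients imply. The function names are illustrative.

```python
import math


def bernoulli_sequence(a, start, count):
    """mu_0, ..., mu_{count-1} with mu_k = -(a_1 mu_{k-1} + ... + a_n mu_{k-n})."""
    mu = list(start)
    n = len(a)
    while len(mu) < count:
        mu.append(-sum(a[i] * mu[-1 - i] for i in range(n)))
    return mu


def complex_pair_estimates(mu, k_first):
    """s_k, t_k and the estimates (xi, eta) of a dominant complex pair."""
    s, t = {}, {}
    for k in range(k_first, len(mu) - 1):
        s[k] = mu[k] ** 2 - mu[k - 1] * mu[k + 1]
        t[k] = mu[k] * mu[k - 1] - mu[k + 1] * mu[k - 2]
    estimates = []
    for k in range(k_first + 1, len(mu) - 1):
        beta_sq = s[k] / s[k - 1]          # approximates beta^2
        xi = t[k] / s[k - 1] / 2.0         # approximates xi (half of 2*xi)
        estimates.append((xi, math.sqrt(beta_sq - xi * xi)))
    return s, t, estimates


a = [-8, 23, 16, -50]                      # coefficients a_1..a_4 (assumed equation)
mu = bernoulli_sequence(a, [0, 0, 0, 1], 10)
s, t, estimates = complex_pair_estimates(mu, 4)
```

Here `mu[4:]` corresponds to the first column of the table, and `estimates` holds the four successive approximations to $\xi_1$ and $\eta_1$.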

The Bernoulli iteration has the useful property that it yields the
dominant root (or roots) regardless of the starting values [except in the
unlikely case when (10.10.16) is satisfied, in which case another root or
root pair will result]. That is, it is not necessary to initiate the iteration
with a sufficiently accurate approximation, as is the case for many other
iterative methods. This fact is of particular importance in those cases
when only complex roots are present, since even rough approximations
then are not readily obtained. The calculation is remarkably simple
(and readily mechanized) when the dominant root is real and unequaled
in absolute value, and is not unduly complicated otherwise.

10.11. Graeffe's Root-squaring Technique. Graeffe's iterative method
for determining roots of the algebraic equation

$$f(x) \equiv x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n = 0 \tag{10.11.1}$$

consists in forming a sequence of equations, such that the roots of each
equation are the squares of the roots of the preceding equation in the
sequence, for the purpose of ultimately obtaining an equation whose roots
are so widely separated in magnitude that they can be read approximately
from the equation, by inspection.

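The principle of the method can be illustrated by a consideration of the
general equation of fourth degree, which can be written in the form

$$f(x) \equiv x^4 + a_1x^3 + a_2x^2 + a_3x + a_4 = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3)(x - \alpha_4) = 0 \tag{10.11.2}$$

or, equivalently,

$$\begin{aligned}
f(x) \equiv x^4 &- (\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4)x^3 \\
&+ (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_1\alpha_4 + \alpha_2\alpha_3 + \alpha_2\alpha_4 + \alpha_3\alpha_4)x^2 \\
&- (\alpha_1\alpha_2\alpha_3 + \alpha_1\alpha_2\alpha_4 + \alpha_1\alpha_3\alpha_4 + \alpha_2\alpha_3\alpha_4)x + \alpha_1\alpha_2\alpha_3\alpha_4 = 0,
\end{aligned} \tag{10.11.3}$$

where $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ are the roots.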
If the roots are all real and are widely separated in magnitude, so that
$|\alpha_1| \gg |\alpha_2| \gg |\alpha_3| \gg |\alpha_4|$, the result of retaining only the dominant part
of each coefficient in (10.11.3) is

$$x^4 - \alpha_1x^3 + \alpha_1\alpha_2x^2 - \alpha_1\alpha_2\alpha_3x + \alpha_1\alpha_2\alpha_3\alpha_4 \approx 0. \tag{10.11.4}$$

Thus the four roots are given approximately, in this case, by equating to
zero the four linear expressions $x + a_1$, $a_1x + a_2$, $a_2x + a_3$, and $a_3x + a_4$.

If, say, $\alpha_1$ and $\alpha_2$ are conjugate complex, so that $\alpha_1 = \beta_1e^{i\phi_1}$ and
$\alpha_2 = \beta_1e^{-i\phi_1}$, and also $|\alpha_1| = |\alpha_2| \gg |\alpha_3| \gg |\alpha_4|$, the approximation
replacing (10.11.4) is then

$$x^4 - 2\beta_1x^3\cos\phi_1 + \beta_1^2x^2 - \beta_1^2\alpha_3x + \beta_1^2\alpha_3\alpha_4 \approx 0. \tag{10.11.5a}$$

The complex roots are then approximated by the zeros of the quadratic
$x^2 + a_1x + a_2$, and the remaining roots are found by equating $a_2x + a_3$
and $a_3x + a_4$ to zero.

If, say, $\alpha_1 = \alpha_2$ and $|\alpha_1| = |\alpha_2| \gg |\alpha_3| \gg |\alpha_4|$, the approximate relation
is

$$x^4 - 2\alpha_1x^3 + \alpha_1^2x^2 - \alpha_1^2\alpha_3x + \alpha_1^2\alpha_3\alpha_4 \approx 0, \tag{10.11.5b}$$

and the approximate roots are obtained in the same way. Other, more
unusual situations can be analyzed similarly.


The root-squaring process itself is based on the fact that the product

$$(-1)^n f(-x)f(x) = (x^2 - \alpha_1^2)(x^2 - \alpha_2^2) \cdots (x^2 - \alpha_n^2) \tag{10.11.6}$$

is a polynomial of degree $n$ in $x^2$, whose zeros are the squares of the zeros
of $f(x)$. Thus, if $f(x) = x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n$ is
multiplied, term by term, by

$$(-1)^n f(-x) = x^n - a_1x^{n-1} + a_2x^{n-2} - \cdots + (-1)^{n-1}a_{n-1}x + (-1)^n a_n,$$

and $x^2$ is then replaced by $x$, the result $f_2(x)$ is a polynomial of degree $n$
with zeros $\alpha_1^2, \ldots, \alpha_n^2$. By repeating the process, a polynomial $f_4(x)$
with zeros $\alpha_1^4, \ldots, \alpha_n^4$ is obtained, then $f_8(x)$ with zeros $\alpha_i^8$, and so forth.

If all roots are real, unrepeated, and of distinct magnitudes, the
iteration is concluded when the magnitude of each coefficient in an equation
is the square of the magnitude of the corresponding coefficient in the
preceding equation, within the tolerance adopted. Suppose that the
original roots are $\alpha_1, \ldots, \alpha_n$, and that $k$ root squarings are needed, so
that the roots of the final equation are $\alpha_1^m, \ldots, \alpha_n^m$, where $m = 2^k$.
If the final equation is of the form

$$f_m(x) \equiv x^n - A_1x^{n-1} + A_2x^{n-2} - \cdots + (-1)^{n-1}A_{n-1}x + (-1)^n A_n = 0, \tag{10.11.7}$$

there then follows

$$\alpha_1^m \approx A_1, \qquad \alpha_2^m \approx \frac{A_2}{A_1}, \qquad \alpha_3^m \approx \frac{A_3}{A_2}, \qquad \ldots, \qquad \alpha_n^m \approx \frac{A_n}{A_{n-1}}. \tag{10.11.8}$$

Each of the right-hand members will be positive, and the proper sign must
be chosen for the real $m$th root of each of these expressions, by
substitution of the two possibilities into the original equation or otherwise.

A double original root $\alpha_1$ would evidence itself by the fact that, after
$k$ root squarings, the equation would be approximately of the form

$$f_m(x) \approx x^n - 2\alpha_1^m x^{n-1} + \alpha_1^{2m}x^{n-2} - (\alpha_1^2\alpha_3)^m x^{n-3} + \cdots = 0, \tag{10.11.9}$$

where again $m = 2^k$, so that the magnitude of the coefficient of $x^{n-1}$ would
tend to be half the square of the magnitude of the corresponding
coefficient in the preceding equation. Similarly, if $\alpha_r$ were a double real root,
the coefficient of $x^{n-r}$ would have this property. Thus $\alpha_r$ would then
satisfy both of the relations

$$\alpha_r^{2m} \approx \frac{A_{r+1}}{A_{r-1}}, \qquad \alpha_r^m \approx \frac{A_r}{2A_{r-1}}, \tag{10.11.10}$$

and would be determined as the real root, with appropriate sign, of either
equation.
A conjugate complex root pair $\alpha_{1,2} = \beta_1e^{\pm i\phi_1}$ would cause the $k$th
equation to be approximately of the form

$$f_m(x) \approx x^n - 2\beta_1^m x^{n-1}\cos m\phi_1 + \beta_1^{2m}x^{n-2} - \cdots = 0, \tag{10.11.11}$$

where $m = 2^k$, so that the coefficient of $x^{n-1}$ in the $k$th equation would
tend to fluctuate in magnitude and sign in the same way as $-2\beta_1^m\cos m\phi_1$
as $k$ and $m = 2^k$ increased, and hence again would not tend to be the
square of the corresponding coefficient in the $(k - 1)$th equation. The
same sort of oscillation would occur in the coefficient of $x^{n-r}$ if $\alpha_r$ and
$\alpha_{r+1}$ were a complex root pair, and, for $k$ sufficiently large, $\beta_r$ and $\phi_r$ could be
determined from the relations

$$\beta_r^{2m} \approx \frac{A_{r+1}}{A_{r-1}}, \qquad 2\beta_r^m\cos m\phi_r \approx \frac{A_r}{A_{r-1}}. \tag{10.11.12}$$

The magnitude $\beta_r$ would thus be the positive real $(2m)$th root of
$A_{r+1}/A_{r-1}$, whereas the appropriate one of the values of $\phi_r$ obtained
from the second relation would have to be selected, by trial and error or
otherwise.


When only one pair of complex roots is present, say

$$\beta_re^{\pm i\phi_r} = \xi_r \pm i\eta_r,$$

the selection of the appropriate value of $\phi_r$ satisfying this relation can be
avoided by noticing that, since the sum of all roots of (10.11.1) is given
by $-a_1$, there follows

$$\alpha_1 + \alpha_2 + \cdots + \alpha_{r-1} + 2\xi_r + \alpha_{r+2} + \cdots + \alpha_n = -a_1. \tag{10.11.13}$$

Hence $\xi_r$ is given immediately when the remaining $n - 2$ roots are known,
after which $\eta_r$ is given by $\sqrt{\beta_r^2 - \xi_r^2}$.


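If two pairs of complex roots are present, say

$$\beta_re^{\pm i\phi_r} = \xi_r \pm i\eta_r, \qquad \beta_se^{\pm i\phi_s} = \xi_s \pm i\eta_s,$$

the corresponding relation is

$$2(\xi_r + \xi_s) = -(a_1 + \alpha_1 + \cdots + \alpha_{r-1} + \alpha_{r+2} + \cdots + \alpha_{s-1} + \alpha_{s+2} + \cdots + \alpha_n). \tag{10.11.14}$$

A second linear relation between $\xi_r$ and $\xi_s$ is then obtained by recalling
that the sum of the reciprocals of the roots is $-a_{n-1}/a_n$, so that

$$\frac{1}{\alpha_1} + \cdots + \frac{1}{\xi_r + i\eta_r} + \frac{1}{\xi_r - i\eta_r} + \cdots + \frac{1}{\xi_s + i\eta_s} + \frac{1}{\xi_s - i\eta_s} + \cdots + \frac{1}{\alpha_n} = -\frac{a_{n-1}}{a_n}$$

or, after rationalizing the reciprocals of the complex numbers and
transposing terms,

$$2\left(\frac{\xi_r}{\beta_r^2} + \frac{\xi_s}{\beta_s^2}\right) = -\left(\frac{a_{n-1}}{a_n} + \frac{1}{\alpha_1} + \frac{1}{\alpha_2} + \cdots + \frac{1}{\alpha_n}\right), \tag{10.11.15}$$

where the reciprocals of the four complex roots are to be omitted in the
right-hand member. Since the magnitudes $\beta_r$ and $\beta_s$ are known, the
relations (10.11.14) and (10.11.15) comprise two linear equations for the
determination of $\xi_r$ and $\xi_s$, after which $\eta_r = \sqrt{\beta_r^2 - \xi_r^2}$ and $\eta_s = \sqrt{\beta_s^2 - \xi_s^2}$.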

Rather than actually multiplying together the polynomials $f(x)$ and
$(-1)^n f(-x)$, in order to obtain the coefficients of the function $f_2(x)$, it is
desirable to work with detached coefficients, and to obtain formulas
relating the new coefficients to the original ones. For this purpose, it is
convenient to write

$$f(x) = A_0x^n - A_1x^{n-1} + A_2x^{n-2} - \cdots = \sum_{i=0}^{n} (-1)^i A_i x^{n-i}, \tag{10.11.16}$$

with the convention that $A_i = 0$ when $i > n$. If we use this convention,
there follows

$$(-1)^n f(x)f(-x) = \Bigl[\sum_{i=0}^{n} (-1)^i A_i x^{n-i}\Bigr]\Bigl[\sum_{j=0}^{n} A_j x^{n-j}\Bigr] = \sum_{i=0}^{n}\sum_{j=0}^{n} (-1)^i A_iA_j x^{2n-i-j}.$$

Since clearly only the even powers of $x$ will remain, we may write
$i + j = 2k$ and, after changing the limits appropriately, we have

$$(-1)^n f(x)f(-x) \equiv f_2(x^2) = \sum_{k=0}^{n} (-1)^k A_k^* (x^2)^{n-k}, \qquad \text{where} \quad A_k^* = \sum_{i=0}^{2k} (-1)^{i+k} A_iA_{2k-i}.$$

Thus $f_2(x)$ is given by

$$f_2(x) = \sum_{k=0}^{n} (-1)^k A_k^* x^{n-k},$$

where

$$A_k^* = A_k^2 - 2A_{k-1}A_{k+1} + 2A_{k-2}A_{k+2} - 2A_{k-3}A_{k+3} + \cdots \tag{10.11.18}$$

and where the series of products terminates when either the first subscript
reduces to zero or the second increases to $n$. This formula is easily
remembered because of the fact that the coefficients $A_{k-i}$ and $A_{k+i}$ involved
in each product are symmetrically placed about $A_k$.

The procedure is illustrated by the simple case of the equation
$f(x) \equiv x^3 - x - 1 = 0$ considered in (10.8.6), for which $A_0 = 1$, $A_1 = 0$,
$A_2 = -1$, and $A_3 = +1$, in accordance with (10.11.16). By
use of (10.11.18), the coefficients of the successive equations, again written
in the form $x^3 - A_1x^2 + A_2x - A_3 = 0$, are obtained as follows in the
first six iterations:

               $A_0$       $A_1$      $A_2$    $A_3$
    $f$          1           0        -1      1
    $f_2$        1           2         1      1
    $f_4$        1           2        -3      1
    $f_8$        1          10         5      1
    $f_{16}$     1          90         5      1
    $f_{32}$     1        8090      -155      1
    $f_{64}$     1    65448410      7845      1
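The tabulated coefficients can be regenerated mechanically from (10.11.18). The following sketch assumes the sign convention of (10.11.16), that is, the list `A` holds $A_0, \ldots, A_n$ for $f(x) = A_0x^n - A_1x^{n-1} + A_2x^{n-2} - \cdots$; the name `graeffe_step` is illustrative.

```python
def graeffe_step(A):
    """One root squaring: A_k* = A_k^2 - 2 A_{k-1} A_{k+1} + 2 A_{k-2} A_{k+2} - ..."""
    n = len(A) - 1
    new = []
    for k in range(n + 1):
        total = A[k] * A[k]
        sign = -2
        i = 1
        # Stop when the first subscript would drop below 0
        # or the second would exceed n.
        while k - i >= 0 and k + i <= n:
            total += sign * A[k - i] * A[k + i]
            sign = -sign
            i += 1
        new.append(total)
    return new


rows = [[1, 0, -1, 1]]                 # f(x) = x^3 - x - 1 in the convention above
for _ in range(6):
    rows.append(graeffe_step(rows[-1]))
for label, row in zip(["f", "f2", "f4", "f8", "f16", "f32", "f64"], rows):
    print(label, row)
```

Because the coefficients are integers here, the arithmetic is exact and no round-off enters.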


The coefficient $A_3$ here remains fixed, whereas the coefficient $A_1$ in $f_{64}$ is
the square of that in $f_{32}$ to five significant figures. The persistent
oscillation of $A_2$ indicates that the roots $\alpha_2$ and $\alpha_3$ are conjugate complex.
Thus the sequence of approximations to $\alpha_1$ is $0$, $\sqrt{2}$, $\sqrt[4]{2}$, $\sqrt[8]{10}$, $\sqrt[16]{90}$,
$\sqrt[32]{8090}$, $\sqrt[64]{65448410}$, ..., or 0, 1.4, 1.2, 1.33, 1.3248, 1.3247, ....
The fact that the positive sign is correct would be determined most easily
by noticing that $f(x)$ changes sign between $x = 1$ and $x = 2$. Reference
to the first equation of (10.11.12), taking into account the fact that
here $A_3 = 1$, shows that the corresponding sequence of values of the
magnitude $\beta_2$ of the complex root pair consists of the reciprocal square
roots of these numbers, so that the best available approximation is
$\beta_2 \approx 0.86884$, to five places. Rather than use the second relation of
(10.11.12), which would involve choosing the appropriate value of $\cos\phi_2$
for which $\cos 64\phi_2 = 0.68263$ from among 64 possibilities, we use
(10.11.13) to obtain $2\xi_2 = -\alpha_1$ and hence $\xi_2 \approx -0.6624$. Finally, there
follows $\eta_2 = \sqrt{\beta_2^2 - \xi_2^2} \approx 0.5622$, so that the approximate roots are
1.3247 and $-0.6624 \pm 0.5622i$.
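The extraction of the roots from the last tabulated equation can be sketched as follows. The sketch assumes the situation of this example (one dominant real positive root and one complex pair), takes the coefficients of $f_{64}$ from the table, and uses (10.11.8), the first relation of (10.11.12), and (10.11.13); the function name is illustrative.

```python
import math


def cubic_roots_from_graeffe(A, m, a1):
    """Real root and complex pair from f_m, with A = [A_0, A_1, A_2, A_3].

    Assumes a dominant positive real root alpha_1 and a complex pair
    xi_2 +/- i*eta_2; a1 is the coefficient of x^2 in the original cubic.
    """
    alpha1 = (A[1] / A[0]) ** (1.0 / m)          # (10.11.8)
    beta2 = (A[3] / A[1]) ** (1.0 / (2 * m))     # first relation of (10.11.12)
    xi2 = (-a1 - alpha1) / 2.0                   # (10.11.13): alpha_1 + 2*xi_2 = -a_1
    eta2 = math.sqrt(beta2 ** 2 - xi2 ** 2)
    return alpha1, beta2, xi2, eta2


alpha1, beta2, xi2, eta2 = cubic_roots_from_graeffe([1, 65448410, 7845, 1], 64, 0)
z = complex(xi2, eta2)
residual = abs(z ** 3 - z - 1)                   # check against f(x) = x^3 - x - 1
```

The residual gives an independent check that the complex pair obtained in this way satisfies the original equation to about the accuracy retained.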

In order to illustrate the calculation involved in less simple cases,
we display the results of five iterations as applied to the equation
$x^4 - 10x^3 + 35x^2 - 50x + 24 = 0$, when only three digits are retained.

               $A_0$     $A_1$       $A_2$       $A_3$       $A_4$
    $f$        1.00    1.00(1)    3.50(1)    5.00(1)    2.40(1)
    $f_2$      1.00    3.00(1)    2.73(2)    8.20(2)    5.76(2)
    $f_4$      1.00    3.54(2)    2.65(4)    3.58(5)    3.32(5)
    $f_8$      1.00    7.23(4)    4.49(8)    1.11(11)   1.10(11)
    $f_{16}$   1.00    4.33(9)    1.86(17)   1.22(22)   1.21(22)
    $f_{32}$   1.00    1.84(19)   3.45(34)   1.49(44)   1.46(44)


Here an integer in parentheses following a number represents the power
of 10 by which that number is to be multiplied to give the desired
coefficient. The entries are obtained simply by use of (10.11.18). For
example, the coefficients in $f_{16}$ may be calculated as follows:

$$\begin{aligned}
A_1 &= 10^{8}\,[(7.23)^2 - 2(1.00)(4.49)], \\
A_2 &= 10^{16}\,[(4.49)^2 - 2(7.23)(1.11)(10^{-1}) + 2(1.00)(1.10)(10^{-5})], \\
A_3 &= 10^{22}\,[(1.11)^2 - 2(4.49)(1.10)(10^{-3})], \\
A_4 &= 10^{22}\,(1.10)^2.
\end{aligned}$$

In a sixth iteration, the squared term in (10.11.18) obviously would
not be modified to three digits by the product terms in any case, so that
the iteration is terminated. Here all roots are clearly real, and the
application of (10.11.8) to $f_{32}$ yields the approximations $\alpha_1 \approx 4.000$,
$\alpha_2 \approx 3.001$, $\alpha_3 \approx 2.000$, $\alpha_4 \approx 0.999$. The correctness of the positive
signs is assured here by the fact that the expression for $f(-x)$ involves
only positive coefficients, so that no negative real roots can be present.

It is of some interest to notice that the use of (10.11.8) at earlier stages
of the iteration would yield the following sequences of approximate roots:

               $\alpha_1$    $\alpha_2$    $\alpha_3$    $\alpha_4$
    $f$        10.000   3.500   1.429   0.480
    $f_2$       5.477   3.017   1.733   0.838
    $f_4$       4.338   2.941   1.917   0.981
    $f_8$       4.049   2.979   1.991   0.999
    $f_{16}$    4.002   3.000   2.000   1.000
    $f_{32}$    4.000   3.001   2.000   0.999
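For comparison, the same sequences can be generated without three-digit rounding. The sketch below forms each new equation directly from the product (10.11.6) rather than from (10.11.18), using ordinary coefficients `[1, a_1, ..., a_n]`, and then applies (10.11.8) to the magnitudes of the coefficient ratios. The function names are illustrative, and the later rows differ slightly from the table because exact integer arithmetic is used.

```python
def graeffe_step_by_product(a):
    """Coefficients of f_2 from (-1)^n f(-x) f(x), keeping even powers only.

    a = [1, a_1, ..., a_n] in descending powers of x.
    """
    n = len(a) - 1
    g = [((-1) ** i) * a[i] for i in range(n + 1)]   # (-1)^n f(-x)
    prod = [0] * (2 * n + 1)                         # index i+j <-> power 2n-(i+j)
    for i, ai in enumerate(a):
        for j, gj in enumerate(g):
            prod[i + j] += ai * gj
    return prod[0::2]                                # odd powers cancel


def magnitude_estimates(a, m):
    """|alpha_i| from (10.11.8): m-th roots of successive coefficient ratios."""
    return [abs(a[i] / a[i - 1]) ** (1.0 / m) for i in range(1, len(a))]


a = [1, -10, 35, -50, 24]
m = 1
table = []
for _ in range(6):                                   # f, f2, f4, f8, f16, f32
    table.append(magnitude_estimates(a, m))
    a = graeffe_step_by_product(a)
    m *= 2
```

The first two rows of `table` agree with the tabulated values to the digits shown; the last row lies close to 4, 3, 2, 1.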


The Graeffe method possesses the theoretical advantages that the itera- 
tion leads to all zeros of f(x) at the same time, and that (as in the Bernoulli 
iteration) there is no question of the existence of ultimate convergence, 
if appropriate attention is paid to the control of round-off errors. How- 
ever, it is often rather laborious, and the extraction of algebraic roots of 
high order, which is involved in the process, is conveniently effected in 
machine calculation only by an iterative process (see Prob. 36). 

A serious disadvantage follows from the fact that a gross error
committed at any stage of the calculation invalidates all subsequent
calculations, whereas the other iterative methods considered would suffer only
a reduction in the rate of convergence.

Rather than use this method for the complete determination of the
roots, it is often convenient merely to iterate sufficiently to obtain crude
approximations, when such approximations are not easily obtained by
other methods, and then to improve these approximations by simpler or
more rapidly convergent methods.

The root-squaring process is also useful in connection with the Bernoulli
iteration, in cases when that iteration appears to converge slowly, since
the rate of convergence increases with increasing values of the ratio of
the magnitudes of the dominant and subdominant roots. Thus the
convergence will be improved if the original equation is replaced by
one whose roots are, say, the squares or fourth powers of the original
roots.

10.12. Iterated Synthetic Division with Quadratic Factors. Lin
Iteration. Among the most troublesome algebraic equations, in practice, are
those which possess two or more pairs of complex roots. Whereas the
methods of §§10.10 and 10.11 can be used in such cases, and will always
generate convergent sequences of approximations, the convergence is
often slow and the labor involved may be excessive. We next treat two
methods which are similar to those considered in §10.9, but in which
successive approximations to a quadratic factor are generated. Both
methods have the property that the iteration may not converge unless
the initial approximation is sufficiently good and, in fact, one of them may
not yield a convergent sequence even in that case. Thus, in troublesome
cases, the use of the Bernoulli or Graeffe iteration may be desirable in
order to afford a reasonably good initial estimate.
If the polynomial

$$f(x) = x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n \tag{10.12.1}$$

is divided by the quadratic expression $x^2 + px + q$, so that

$$\begin{aligned}
f(x) &= x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n \\
&= (x^2 + px + q)(x^{n-2} + b_1x^{n-3} + \cdots + b_{n-3}x + b_{n-2}) + Rx + S,
\end{aligned} \tag{10.12.2}$$

the requirement that this expression be a factor of $f(x)$ imposes the two
conditions

$$R = 0, \qquad S = 0, \tag{10.12.3}$$

where $R$ and $S$ are the coefficients of the linear remainder and are certain
functions of the parameters $p$ and $q$.

In order to obtain a simple method for obtaining $R$ and $S$ without
actually effecting the long division, we equate coefficients of like powers
of $x$ in the two members of (10.12.2) and thus obtain the relations

$$\begin{aligned}
a_1 &= b_1 + p, \qquad a_2 = b_2 + pb_1 + q, \qquad a_3 = b_3 + pb_2 + qb_1, \\
&\ldots, \qquad a_k = b_k + pb_{k-1} + qb_{k-2}, \qquad \ldots, \\
a_{n-2} &= b_{n-2} + pb_{n-3} + qb_{n-4}, \qquad a_{n-1} = R + pb_{n-2} + qb_{n-3}, \\
a_n &= S + qb_{n-2}.
\end{aligned} \tag{10.12.4}$$

Thus, if we introduce the recurrence formula

$$b_k = a_k - pb_{k-1} - qb_{k-2} \qquad (k = 1, 2, \ldots, n) \tag{10.12.5}$$

with

$$b_{-1} = 0, \qquad b_0 = 1, \tag{10.12.6}$$

it follows that this formula will generate the coefficients of the quotient
in (10.12.2) with $k = 1, 2, \ldots, n - 2$, and also that

$$R = b_{n-1} = a_{n-1} - pb_{n-2} - qb_{n-3}, \tag{10.12.7}$$

$$S = b_n + pb_{n-1} = a_n - qb_{n-2}. \tag{10.12.8}$$
Hence the expression $x^2 + px + q$ will factor $f(x)$ if and only if

$$R \equiv a_{n-1} - pb_{n-2} - qb_{n-3} = 0 \qquad \text{and} \qquad S \equiv a_n - qb_{n-2} = 0. \tag{10.12.9}$$

Lin's iteration consists in applying the method of successive
substitutions to the result of rewriting (10.12.9) in the form

$$p = \frac{a_{n-1} - qb_{n-3}}{b_{n-2}}, \qquad q = \frac{a_n}{b_{n-2}},$$

so that "improved" values of $p$ and $q$ are defined by the formulas

$$p^* = \frac{a_{n-1} - qb_{n-3}}{b_{n-2}}, \qquad q^* = \frac{a_n}{b_{n-2}}, \tag{10.12.10}$$

and hence

$$p^* - p = \frac{a_{n-1} - pb_{n-2} - qb_{n-3}}{b_{n-2}}, \qquad q^* - q = \frac{a_n - qb_{n-2}}{b_{n-2}},$$

or, equivalently, in virtue of (10.12.7) and (10.12.8),

$$p^* = p + \frac{R}{b_{n-2}}, \qquad q^* = q + \frac{S}{b_{n-2}}. \tag{10.12.11}$$

In analogy to (10.9.14), it is known that, if $p$ and $q$ are to be such that
the zeros of $x^2 + px + q$ approximate the true zeros $\alpha_1$ and $\alpha_2$ of $f(x)$,
then the two relevant asymptotic convergence factors are (see Prob. 75)

$$\rho_1 = 1 + \frac{\alpha_1\alpha_2}{\alpha_2 - \alpha_1}\,\frac{f'(\alpha_1)}{a_n}, \qquad \rho_2 = 1 - \frac{\alpha_1\alpha_2}{\alpha_2 - \alpha_1}\,\frac{f'(\alpha_2)}{a_n}. \tag{10.12.12}$$

That is, if either or both of these factors exceeds unity in absolute value,
then one or both of the zeros of the modified expression $x^2 + p^*x + q^*$
generally will afford poorer approximations to $\alpha_1$ and $\alpha_2$ than the zeros
of the expression $x^2 + px + q$. Thus, if $x^2 + px + q$ is to converge to
$(x - \alpha_1)(x - \alpha_2)$, it is generally necessary that

$$|\rho_1| < 1, \qquad |\rho_2| < 1. \tag{10.12.13}$$

In addition, it is necessary that the initial estimates of $p$ and $q$ not differ
excessively from $-(\alpha_1 + \alpha_2)$ and $\alpha_1\alpha_2$, respectively.

The result (10.12.12) is useful only if fair approximations to a pair of
roots can be obtained in advance. In analogy to (10.9.16), the condition
(10.12.13) can also be expressed in the form

$$\left|\,1 - \Bigl(1 - \frac{\alpha_i}{\alpha_3}\Bigr)\Bigl(1 - \frac{\alpha_i}{\alpha_4}\Bigr) \cdots \Bigl(1 - \frac{\alpha_i}{\alpha_n}\Bigr)\right| < 1 \qquad (i = 1, 2). \tag{10.12.14}$$


In the absence of preliminary information, the iteration may be started
with arbitrarily chosen values of $p$ and $q$, in the hope that convergence to
some root pair (real or complex) will ensue. With the convenient initial
choice $p = 0$, $q = 0$, the first iteration always yields the quadratic

$$x^2 + \frac{a_{n-1}}{a_{n-2}}\,x + \frac{a_n}{a_{n-2}},$$

whose zeros will approximate the two smallest roots of $f(x) = 0$ if those
roots are sufficiently small relative to the others. It is seen that the
initial choice $p = a_1$, $q = a_2$, corresponding to the quadratic $x^2 + a_1x + a_2$,
whose zeros would approximate the two largest roots of $f(x) = 0$ if
those roots were sufficiently separated in magnitude from the others,
leads always to $b_{n-2} = 0$ in the first iteration, so that the following
iteration then is undefined. This fact indicates that convergence of the Lin
iteration to the largest root pair generally cannot be obtained when that
pair is widely separated in magnitude from the others, as can be seen also
by noticing that (10.12.14) then will tend to be violated.

In the more general case, however, (10.12.14) shows that the possibility 
of convergence to the largest root pair, or to any other chosen root pair, 
depends in a fairly complicated way upon the configuration of all the 
roots. 

The calculation can be arranged in parallel columns, as follows:

        $1$            $1$
        $a_1$          $b_1$
        ...            ...
        $a_{n-2}$      $b_{n-2}$
        $a_{n-1}$      $R$
        $a_n$          $S$

Here each entry in the $b$ column except the last is obtained by subtracting
from its left-hand neighbor $p$ times its first upward neighbor and $q$ times
its second upward neighbor. (In calculating $b_0$ and $b_1$, the missing entries
are taken to be zero.) The last element ($S$) is calculated in the same way
except that its first upward neighbor is imagined to be replaced by zero.
Finally,

$$\Delta p \equiv p^* - p = \frac{R}{b_{n-2}} \qquad \text{and} \qquad \Delta q \equiv q^* - q = \frac{S}{b_{n-2}}.$$

In illustration, the quartic equation

$$f(x) \equiv x^4 - 8x^3 + 39x^2 - 62x + 50 = 0 \tag{10.12.15}$$

possesses the complex roots $1 \pm i$ and $3 \pm 4i$, and $f(x)$ is factorable in
the form $f(x) = (x^2 - 2x + 2)(x^2 - 6x + 25)$. The first steps in a Lin
iteration, assuming ignorance of this information, and starting with
$p = q = 0$, may be tabulated as follows:

    $p$ =              0       -1.6     -1.95    -2.009   -2.008   -2.003   -2.0007
    $q$ =              0        1.3      1.82     1.970    2.001    2.003    2.0012
               1       1        1        1        1        1        1
              -8      -8       -6.4     -6.05    -5.991   -5.992   -5.997
              39      39       27.5     25.38    24.994   24.967   24.985
             -62     -62       -9.7     -1.50     0.015    0.124    0.057
              50      50       14.2      3.81     0.762    0.041   -0.045
    $\Delta p$ =     -1.6      -0.35    -0.059    0.001    0.005    0.0023
    $\Delta q$ =      1.3       0.52     0.150    0.031    0.002   -0.0018

Thus, at this stage, the approximate factorization is

$$f(x) \approx (x^2 - 2.001x + 2.001)(x^2 - 5.997x + 24.985).$$
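The tabulated steps follow a fixed pattern that is easily sketched in code. The block below assumes the recurrence (10.12.5) with $b_{-1} = 0$, $b_0 = 1$, and the correction formulas (10.12.11); the coefficient list is `[1, a_1, ..., a_n]` and the function name `lin_step` is illustrative. Full floating-point precision is used, so the intermediate values differ slightly from the rounded entries of the table.

```python
def lin_step(a, p, q):
    """One Lin iteration for the trial factor x^2 + p x + q.

    a = [1, a_1, ..., a_n]. Returns (p*, q*, R, S).
    """
    n = len(a) - 1
    b = [1.0]                                      # b_0
    for k in range(1, n - 1):                      # b_1, ..., b_{n-2}
        b_km2 = b[k - 2] if k >= 2 else 0.0        # b_{-1} = 0
        b.append(a[k] - p * b[k - 1] - q * b_km2)
    b_n3 = b[n - 3] if n >= 3 else 0.0
    R = a[n - 1] - p * b[n - 2] - q * b_n3         # (10.12.7)
    S = a[n] - q * b[n - 2]                        # (10.12.8)
    return p + R / b[n - 2], q + S / b[n - 2], R, S


a = [1, -8, 39, -62, 50]                           # the quartic (10.12.15)
p, q = 0.0, 0.0
history = [(p, q)]
for _ in range(60):
    p, q, R, S = lin_step(a, p, q)
    history.append((p, q))
```

The first entries of `history` track the $p$ and $q$ rows of the table, and the later entries settle on the factor $x^2 - 2x + 2$.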


The Lin iteration technique is perhaps the simplest known method 
for the numerical solution of algebraic equations, when two or more pairs 
of complex roots are present. However, it possesses the disadvantage 
that convergence is not certain, even though the starting values are good 
approximations to true values, and that the rate of convergence, when 
present, is often rather slow. The following section describes a somewhat 
more elaborate method, which usually has better convergence properties. 
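The dependence of the Lin iteration on the root configuration can be illustrated numerically for the quartic (10.12.15). The sketch below assumes that the asymptotic convergence factors have the form $\rho_1 = 1 + \alpha_1\alpha_2 f'(\alpha_1)/[(\alpha_2 - \alpha_1)a_n]$ and $\rho_2 = 1 - \alpha_1\alpha_2 f'(\alpha_2)/[(\alpha_2 - \alpha_1)a_n]$, which is the form obtained by linearizing (10.12.11) about an exact factor; the function name is illustrative.

```python
def lin_convergence_factors(a, alpha1, alpha2):
    """Assumed asymptotic convergence factors of the Lin iteration.

    a = [1, a_1, ..., a_n]; alpha1, alpha2 are the zeros of the quadratic factor.
    """
    n = len(a) - 1

    def fprime(x):
        # Derivative of x^n + a_1 x^(n-1) + ... + a_n.
        return sum((n - i) * a[i] * x ** (n - i - 1) for i in range(n))

    scale = alpha1 * alpha2 / ((alpha2 - alpha1) * a[n])
    rho1 = 1 + scale * fprime(alpha1)
    rho2 = 1 - scale * fprime(alpha2)
    return rho1, rho2


a = [1, -8, 39, -62, 50]
small_pair = lin_convergence_factors(a, 1 + 1j, 1 - 1j)   # roots 1 +/- i
large_pair = lin_convergence_factors(a, 3 + 4j, 3 - 4j)   # roots 3 +/- 4i
```

Under this assumption the factors for the pair $1 \pm i$ have magnitude of about 0.29, whereas those for the pair $3 \pm 4i$ have magnitude of about 10, so that only the smaller pair can be reached by the iteration in this example.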

Use of (10.12.11) shows that the relation (10.12.2) is equivalent to the
relation

$$f(x) = (x^2 + px + q)(x^{n-2} + b_1x^{n-3} + \cdots + b_{n-3}x) + b_{n-2}(x^2 + p^*x + q^*). \tag{10.12.16}$$

Thus it follows that if, in dividing $f(x)$ by the trial factor $x^2 + px + q$,
the steps in the division are terminated when the remainder is quadratic,
the new Lin trial factor $x^2 + p^*x + q^*$ can be obtained by dividing the
remainder by its leading coefficient. For this reason, Aitken [27] refers
to the new Lin trial factor as the reduced penultimate remainder.

It may be mentioned that Lin also suggested a modification of his
technique, in which the new value $q^*$ is calculated first, from the second
relation of (10.12.10), after which $q^*$ is used in place of $q$, in the first relation
of (10.12.10), for the calculation of an "improved" value of $p$. An
equivalent procedure consists in first calculating $\Delta p$ and $\Delta q$ by the
preceding method, and then calculating a modified $\Delta p$ from the formula

$$(\Delta p)_{\text{mod}} = \Delta p - \frac{b_{n-3}}{b_{n-2}}\,\Delta q. \tag{10.12.17}$$

In some cases this modification affords improved convergence; in others
(including the present example) the reverse is true.

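10.13. Bairstow Iteration. Another iterative method for solving
algebraic equations, apparently first devised by Bairstow, but rediscovered by
Hitchcock and others, differs from the Lin method in that the equations

$$R(p,q) = 0, \qquad S(p,q) = 0$$

are solved by Newton-Raphson iteration, rather than by the method of
successive substitutions used by Lin, so that it is a second-order process.
In virtue of the relations (10.12.7) and (10.12.8), we have

$$R = b_{n-1}, \qquad S = b_n + pb_{n-1}, \tag{10.13.1}$$

and hence the Newton-Raphson recurrence relations (10.8.25) become

$$\frac{\partial b_{n-1}}{\partial p}\,\Delta p + \frac{\partial b_{n-1}}{\partial q}\,\Delta q + b_{n-1} = 0$$

and

$$\left(\frac{\partial b_n}{\partial p} + p\,\frac{\partial b_{n-1}}{\partial p} + b_{n-1}\right)\Delta p + \left(\frac{\partial b_n}{\partial q} + p\,\frac{\partial b_{n-1}}{\partial q}\right)\Delta q + b_n + pb_{n-1} = 0, \tag{10.13.2}$$

where $\Delta p = p^* - p$ and $\Delta q = q^* - q$. If the second relation is
simplified, by subtracting from it $p$ times the first equation, the two relations
become

$$\begin{aligned}
\frac{\partial b_{n-1}}{\partial p}\,\Delta p + \frac{\partial b_{n-1}}{\partial q}\,\Delta q + b_{n-1} &= 0, \\
\left(\frac{\partial b_n}{\partial p} + b_{n-1}\right)\Delta p + \frac{\partial b_n}{\partial q}\,\Delta q + b_n &= 0.
\end{aligned} \tag{10.13.3}$$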


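If we recall that the $b$'s are defined in terms of the coefficients of $f(x)$
by the recurrence formula (10.12.5),

$$b_k = a_k - pb_{k-1} - qb_{k-2} \quad (k = 1, 2, \ldots, n), \qquad b_{-1} = 0, \quad b_0 = 1, \tag{10.13.4}$$

it remains only to determine the partial derivatives involved in (10.13.3).
For this purpose, we obtain from the relation (10.13.4) the additional
relations

$$-\frac{\partial b_k}{\partial p} = b_{k-1} + p\,\frac{\partial b_{k-1}}{\partial p} + q\,\frac{\partial b_{k-2}}{\partial p} \quad (k = 1, 2, \ldots, n), \qquad \frac{\partial b_{-1}}{\partial p} = \frac{\partial b_0}{\partial p} = 0 \tag{10.13.5}$$

and

$$-\frac{\partial b_k}{\partial q} = b_{k-2} + p\,\frac{\partial b_{k-1}}{\partial q} + q\,\frac{\partial b_{k-2}}{\partial q} \quad (k = 1, 2, \ldots, n), \qquad \frac{\partial b_{-1}}{\partial q} = \frac{\partial b_0}{\partial q} = 0. \tag{10.13.6}$$

Hence, if we introduce a new recurrence formula

$$c_k = b_k - pc_{k-1} - qc_{k-2} \quad (k = 1, 2, \ldots, n - 1), \qquad c_{-1} = 0, \quad c_0 = 1, \tag{10.13.7}$$

it follows, from (10.13.5), that

$$\frac{\partial b_k}{\partial p} = -c_{k-1} \qquad (k = 1, 2, \ldots, n) \tag{10.13.8}$$

and, from (10.13.6), that

$$\frac{\partial b_k}{\partial q} = -c_{k-2} \qquad (k = 1, 2, \ldots, n), \tag{10.13.9}$$

where the $c$'s are obtained from the $b$'s just as the $b$'s are obtained from
the $a$'s.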
Thus the first $n - 4$ of the $c$'s are the coefficients in the relation

$$\begin{aligned}
x^{n-2} &+ b_1x^{n-3} + \cdots + b_{n-3}x + b_{n-2} \\
&= (x^2 + px + q)(x^{n-4} + c_1x^{n-5} + \cdots + c_{n-5}x + c_{n-4}) + R'x + S'
\end{aligned} \tag{10.13.10}$$

and also

$$R' = c_{n-3}, \qquad S' = c_{n-2} + pc_{n-3}. \tag{10.13.11}$$

In particular, we have

$$\frac{\partial b_{n-1}}{\partial p} = -c_{n-2}, \qquad \frac{\partial b_{n-1}}{\partial q} = -c_{n-3}, \qquad \frac{\partial b_n}{\partial q} = -c_{n-2}, \tag{10.13.12}$$

so that three of the four desired coefficients in (10.13.3) are now
identified, and are calculable from (10.13.7). When $k = n$, Eq. (10.13.8)
gives

$$\frac{\partial b_n}{\partial p} = -c_{n-1}, \tag{10.13.13}$$

and hence the remaining coefficient in (10.13.3) is given by

$$\frac{\partial b_n}{\partial p} + b_{n-1} = -\bar{c}_{n-1}, \tag{10.13.14}$$

where, in accordance with (10.13.7),

$$\bar{c}_{n-1} = c_{n-1} - b_{n-1} = -pc_{n-2} - qc_{n-3}. \tag{10.13.15}$$

The basic equations of the Bairstow iteration then take the simple form

$$\begin{aligned}
c_{n-2}\,\Delta p + c_{n-3}\,\Delta q &= b_{n-1}, \\
\bar{c}_{n-1}\,\Delta p + c_{n-2}\,\Delta q &= b_n,
\end{aligned} \tag{10.13.16}$$

and the principal calculation involved in an iteration can be arranged as
follows:

        $1$            $1$            $1$
        $a_1$          $b_1$          $c_1$
        ...            ...            ...
        $a_{n-4}$      $b_{n-4}$      $c_{n-4}$
        $a_{n-3}$      $b_{n-3}$      $c_{n-3}$
        $a_{n-2}$      $b_{n-2}$      $c_{n-2}$
        $a_{n-1}$      $b_{n-1}$      $\bar{c}_{n-1}$
        $a_n$          $b_n$

Here each element in the $b$ column (including $b_n$), and each element in
the $c$ column except the last one ($\bar{c}_{n-1}$), is calculated as in the Lin iteration,
as the result of subtracting from the element to its left $p$ times the last
calculated element above it and $q$ times the next-to-last element above it.
The element $\bar{c}_{n-1}$ is calculated in the same way except that the element to
its left is imagined to be replaced by zero.

In addition, it is necessary to solve the simultaneous linear equations
(10.13.16) for the corrections to be added to $p$ and $q$ to give $p^*$ and $q^*$.
For this purpose, the quantities

$$D = c_{n-2}^2 - \bar{c}_{n-1}c_{n-3} \tag{10.13.17}$$

and

$$D_p = b_{n-1}c_{n-2} - b_nc_{n-3}, \qquad D_q = -b_{n-1}\bar{c}_{n-1} + b_nc_{n-2} \tag{10.13.18}$$

may be tabulated, after which there follows

$$\Delta p = \frac{D_p}{D}, \qquad \Delta q = \frac{D_q}{D}. \tag{10.13.19}$$

The first three stages of the result of applying the Bairstow iteration to
the equation (10.12.15), again starting with $p = q = 0$, appear as follows:

    $p, q$ =                0, 0            -1.3, 1.3          -1.9, 1.9       -1.998, 1.998
         $a$            $b$      $c$       $b$      $c$        $b$       $c$
          1              1        1         1        1          1         1
         -8             -8       -8       -6.7     -5.4      -6.10     -4.20
         39             39       39       29.0     20.7      25.51     15.63
        -62            -62        0      -15.6     33.9      -1.94      37.7
         50             50                -8.0               -2.16
    $D$ =                  1521               612                 403
    $D_p, D_q$ =       -2018, 1950         -366, 363          -39.4, 39.4
    $\Delta p, \Delta q$ =   -1.3, 1.3          -0.6, 0.6        -0.098, 0.098

The next (fourth) iteration gives $p \approx -1.9999992$ and $q \approx 1.9999992$,
if sufficiently many digits are retained in the calculation.
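One Bairstow step can be sketched compactly from the formulas above. The block assumes the recurrences (10.13.4) and (10.13.7) and the determinant formulas (10.13.17) to (10.13.19); the coefficient list is `[1, a_1, ..., a_n]` with $n \ge 2$, and the name `bairstow_step` is illustrative. No rounding is applied, so intermediate values differ slightly from the tabulated ones.

```python
def bairstow_step(a, p, q):
    """One Bairstow iteration for the trial factor x^2 + p x + q.

    a = [1, a_1, ..., a_n]. Returns (p*, q*).
    """
    n = len(a) - 1
    b = []
    for k in range(n + 1):                         # b_0, ..., b_n
        b1 = b[k - 1] if k >= 1 else 0.0
        b2 = b[k - 2] if k >= 2 else 0.0
        b.append(a[k] - p * b1 - q * b2)
    c = []
    for k in range(n - 1):                         # c_0, ..., c_{n-2}
        c1 = c[k - 1] if k >= 1 else 0.0
        c2 = c[k - 2] if k >= 2 else 0.0
        c.append(b[k] - p * c1 - q * c2)
    c_n3 = c[n - 3] if n >= 3 else 0.0
    c_bar = -p * c[n - 2] - q * c_n3               # (10.13.15)
    D = c[n - 2] ** 2 - c_bar * c_n3               # (10.13.17)
    Dp = b[n - 1] * c[n - 2] - b[n] * c_n3         # (10.13.18)
    Dq = -b[n - 1] * c_bar + b[n] * c[n - 2]
    return p + Dp / D, q + Dq / D                  # (10.13.19)


a = [1, -8, 39, -62, 50]                           # the quartic (10.12.15)
p, q = 0.0, 0.0
first = bairstow_step(a, p, q)
for _ in range(10):
    p, q = bairstow_step(a, p, q)
```

A practical routine would also guard against a vanishing determinant $D$ and would stop when $\Delta p$ and $\Delta q$ fall below a tolerance; both are omitted here for brevity.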

A comparison of these results with those obtained in the preceding 
section illustrates the fact that, whereas the Bairstow iteration may 
converge more slowly than the Lin iteration in the early stages, when both 
iterations converge, its ultimate rate of convergence in such cases is far 
superior. This is due to the fact that it is a second-order process, 
whereas the Lin iteration is a first-order process. 

Furthermore, the Bairstow iteration will converge if the starting values 
of p and q are sufficiently close to true values, whereas in the Lin iteration 
this is not always the case. On the other hand, the Bairstow iteration 
appears to be somewhat more sensitive to the choice of starting values 
than does the Lin iteration, in the sense that, if the Lin iteration is
asymptotically stable at $(\alpha_1, \alpha_2)$, it may converge with starting values
which correspond to cruder approximations to $(\alpha_1, \alpha_2)$ than are required
for convergence of the Bairstow iteration.

Various modifications of both the Lin and Bairstow procedures are 
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possible. A somewhat similar method, due to Friedman [89], involves 
two separate synthetic divisions per iteration, so that the labor involved 
is comparable with that of the Bairstow procedure. However, it is not 
a second-order process, so that its convergence properties are more nearly 
comparable to those of the Lin procedure.

It is possible to show that, when the Friedman iteration is used for
extraction of a linear factor, the first step in an iteration is identical with
the Lin iteration which determines $z^*$ from the starting value $z$ by use of
(10.9.12). In place of then recalculating all the $b$'s and $R$, with $z$ replaced
by $z^*$, the Friedman procedure next retains the values of $b_0, b_1, \ldots,
b_{n-2}$, and recalculates only $b_{n-1}$ and $R$ in this way. If these new values
are designated by $b_{n-1}^*$ and $R^*$, the cycle is concluded by again using the
Lin formula (10.9.12) to calculate $z^{**} = z^* - (R^*/b_{n-1}^*)$, to serve as the
starting value in the next two-step cycle.

In the case of a quadratic factor, the two steps in a Friedman iteration
are equivalent to first calculating $p^*$ and $q^*$ from the starting values $p$ and
$q$ by the modified Lin procedure, based on the use of (10.12.17), and then
retaining the values of $b_0, b_1, \ldots, b_{n-3}$, and recalculating only $b_{n-2}$, $R$, and
$S$ with $p$ and $q$ replaced by $p^*$ and $q^*$, and applying the unmodified Lin
formulas to them.

In some cases, the Friedman procedure is superior to the Lin method,
in others inferior. Either may converge when the other does not.

10.14. Supplementary References. A comprehensive bibliography of
known methods for solving sets of linear algebraic equations has been
compiled by Forsythe [80] and is included in [176]. See also Taussky
[221], Bodewig [48], and Forsythe [81]. A few papers treating special
methods are listed in the present bibliography. Householder [3] gives
concise treatments of several of the modern methods. Error analyses
are included in Hotelling [115, 116], Bargmann, Montgomery, and
von Neumann [30], von Neumann and Goldstine [166], Lonseth [138],
Redheffer [190], Fadeeva [74], Dwyer [72], Goldstine and von Neumann
[95], and de la Garza [91]. Certain problems in linear programming, and
in other fields, involve the treatment of sets of simultaneous linear
inequalities. Some available methods are outlined in Charnes, Cooper, and
Henderson [53]. Turing [227] discusses measures of the "condition" of
a matrix. Schröder [210] appears to have initiated the classification with
respect to "order" of iterative methods for solving nonlinear equations.
More recent studies of general properties of such methods include those of
Hamilton [101] and Bodewig [49]. The Graeffe procedure is analyzed by
Bodewig [47], Hoel and Wall [114], and Ostrowski [174]; the Bernoulli
iteration by Aitken [23]. The method of Lin [135, 136] is studied by
Aitken [26] and is generalized by Luke and Ufford [142] to the extraction
of factors of degree greater than two. See also Aitken [27]. For the
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application of Newton-Raphson iteration to the determination of quad- 
ratic factors of polynomials, see Bairstow [28] and Hitchcock [112]. 


PROBLEMS 
Section 10.2 


1. Solve the following set of equations by use of determinants, without introducing 
round-offs: 


1.42, + 2.322 + 3.723 = 6.5, 
3.321 + 1.6%. + 4.323 = 10.3, 
2.521 + 1.922 + 4.123 = 8.8. 


2. Determine $D$ times the inverse of the coefficient matrix in Prob. 1 without 
introducing round-offs, where $D = -0.249$ is the determinant of that matrix. Then use 
this matrix to obtain explicit expressions for $Dx_1$, $Dx_2$, and $Dx_3$ when the respective 
right-hand members are replaced by $c_1$, $c_2$, and $c_3$, and check the results when the $c$'s 
are assigned the values given. Also use this result to investigate the significance of 
the solution if it is supposed that the given coefficients are exact, but that the given 
right-hand members are only rounded numbers. 

3. Show that the equations 

$$\omega x_1 + 3x_2 + x_3 = 5,$$
$$2x_1 - x_2 + 2\omega x_3 = 3,$$
$$x_1 + 4x_2 + \omega x_3 = 6$$

possess a unique solution when $\omega \ne \pm 1$, that no solution exists when $\omega = -1$, and 
that infinitely many solutions exist when $\omega = 1$. Also, investigate the corresponding 
situation when the right-hand members are replaced by zeros. 


Section 10.3 


4. By considering the result of increasing each x by unity in each equation of 
(10.2.1), establish the validity of the following error check: 

If to each equation is adjoined an entry representing the sum of the coefficients and the 
right-hand member of that equation, and if the column of those entries is transformed under 
the Gauss (or Gauss-Jordan) reduction in the same way as the column of right-hand 
members, then, at each succeeding step, the transformed entry associated with any trans- 
formed equation will equal the sum of the coefficients and the right-hand member of that 
equation, except for the effects of intermediate round-offs or gross errors. 
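
The check described in Prob. 4 can be sketched as follows. This is only an illustrative arrangement, assuming a Gauss reduction to unit upper-triangular form without row interchanges; the function name `gauss_with_check` is hypothetical. Exact fractions are used so that the check entries agree with the row sums identically; with rounded arithmetic the discrepancy would instead measure the accumulated round-off.

```python
from fractions import Fraction


def gauss_with_check(a, c):
    """Gauss reduction with an adjoined sum-check column.

    Each working row is [coefficients..., right-hand member, check entry].
    Returns the reduced rows and the largest discrepancy found between a
    check entry and the sum of the other entries of its row.
    """
    n = len(a)
    rows = [list(a[i]) + [c[i]] for i in range(n)]
    rows = [r + [sum(r)] for r in rows]      # adjoin the check entries
    worst = 0
    for k in range(n):
        pivot = rows[k][k]                   # assumed nonzero in this sketch
        rows[k] = [v / pivot for v in rows[k]]
        for i in range(k + 1, n):
            m = rows[i][k]
            rows[i] = [vi - m * vk for vi, vk in zip(rows[i], rows[k])]
        # The check column is transformed like the right-hand members, so it
        # should still equal the sum of the other entries in every row.
        for r in rows:
            worst = max(worst, abs(sum(r[:-1]) - r[-1]))
    return rows, worst


A = [[Fraction(s) for s in row] for row in
     [("1.4", "2.3", "3.7"), ("3.3", "1.6", "4.3"), ("2.5", "1.9", "4.1")]]
C = [Fraction(s) for s in ("6.5", "10.3", "8.8")]

reduced, worst = gauss_with_check(A, C)
print("largest check discrepancy:", worst)
```

A gross error introduced into any entry during the reduction would make the corresponding check entry disagree with its row sum at the next comparison.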

5. Solve the set of equations in Prob. 1 by the Gauss reduction, retaining only five 
decimal places in the intermediate calculation, and using the error check of Prob. 4. 

6. Proceed as in Prob. 5 with the following set of equations: 


$$8.467x_1 + 5.137x_2 + 3.141x_3 + 2.063x_4 = 29.912,$$
$$5.137x_1 + 6.421x_2 + 2.617x_3 + 2.003x_4 = 25.058,$$
$$3.141x_1 + 2.617x_2 + 4.128x_3 + 1.628x_4 = 16.557,$$
$$2.063x_1 + 2.003x_2 + 1.628x_3 + 3.446x_4 = 12.690.$$


7. Repeat the calculation of Prob. 6, using the Gauss-Jordan reduction. 


Section 10.4 


8, 9. Proceed as in Probs. 5 and 6, using the Crout reduction. 

10, 11. Assuming the given data to be exact, and starting with the approximate 
solutions of Probs. 8 and 9, obtain the solutions of those problems with 10-place 
accuracy by use of the Crout reduction. 

12. Solve the following set of equations, as it stands, by the Crout reduction, 
retaining only five decimal places in the intermediate calculation. Then repeat the 
calculation after an appropriate reordering of equations and renumbering of variables, 
using the computational-error check in both cases: 

$$1.423x_1 + 2.316x_2 + 3.218x_3 = 8.553,$$
$$2.316x_1 + 3.751x_2 + 1.244x_3 = 7.342,$$
$$3.218x_1 + 1.244x_2 + 6.173x_3 = 13.349.$$


Section 10.5 

13, 14. Determine the inverse of the coefficient matrix in Probs. 8 and 9 by the 
Crout reduction, retaining five decimal places. Also evaluate the determinant of the 
coefficient matrix in each case. 


Section 10.6 

15, 16. Use the results of Probs. 13 and 14 to obtain approximate upper bounds on 
the inherent errors relevant to the solutions of Probs. 8 and 9, assuming (a) that the 
coefficients are exact and the errors in the right-hand members cannot exceed $\varepsilon$ in 
magnitude and (b) that the coefficients as well as the right-hand members may be in 
error by as much as $\pm\varepsilon$. In each case, determine what can be said about the solution 
if the errors in the given data are due to round-off. 

17, 18. Reestimate the error bounds considered in Probs. 15 and 16 by use of the 
inherent-error check column. 

19. If $x_1, \ldots, x_n$ satisfy the equations 

$$\sum_{k=1}^{n} a_{ik}x_k = c_i \qquad (i = 1, 2, \ldots, n),$$

show that there follows 

$$\sum_{k=1}^{n} a_{ik}\frac{\partial x_k}{\partial c_r} = \begin{cases} 0 & (i \ne r), \\ 1 & (i = r), \end{cases} \qquad \sum_{k=1}^{n} a_{ik}\frac{\partial x_k}{\partial a_{rs}} = \begin{cases} 0 & (i \ne r), \\ -x_s & (i = r), \end{cases}$$

and deduce the relations 

$$\frac{\partial x_k}{\partial c_r} = \tilde{A}_{rk}, \qquad \frac{\partial x_k}{\partial a_{rs}} = -\tilde{A}_{rk}x_s.$$

20. Use the results of Prob. 19, and the data of Prob. 13, to obtain approximations 
to the changes in the values of $x_1$, $x_2$, and $x_3$ in Prob. 1 corresponding (a) to an increase 
of 0.05 in $c_3 = 8.8$ and (b) to a decrease of 0.05 in $a_{23} = 4.3$. 

21. Use the results of Prob. 19, and the data of Prob. 14, to obtain approximations 
to the changes in the values of $x_1$, $x_2$, $x_3$, and $x_4$ in Prob. 6, corresponding (a) to an 
increase of 0.001 in $c_3 = 16.557$ and (b) to a decrease of 0.001 in $a_{23} = 2.617$. 


Section 10.7 

22. Determine the solution of the following set of equations, to four decimal places, 
by use of the Gauss-Seidel iteration: 

$$3.955x_1 - 1.013x_2 = 0.3068,$$
$$-1.007x_1 + 3.926x_2 - 1.023x_3 = 0.8669,$$
$$-1.013x_2 + 3.887x_3 - 1.038x_4 = 1.3168,$$
$$-1.021x_3 + 3.841x_4 = 2.7997.$$
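
For orientation, a Gauss-Seidel sweep on a strongly diagonal tridiagonal system of this kind might be organized as in the sketch below. The coefficients are those of the Gauss-Seidel system printed in this problem set (3.955, -1.013, and so on); the function name `gauss_seidel`, the zero starting vector, and the stopping tolerance are assumptions made only for this illustration.

```python
def gauss_seidel(a, c, tol=1e-10, max_sweeps=200):
    """Gauss-Seidel iteration: each unknown is replaced as soon as it is
    recomputed, so later equations in the same sweep use the newest values."""
    n = len(c)
    x = [0.0] * n                      # assumed starting approximation
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for i in range(n):
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            new = (c[i] - s) / a[i][i]
            change = max(change, abs(new - x[i]))
            x[i] = new
        if change < tol:
            return x, sweep
    raise RuntimeError("no convergence within max_sweeps")


A = [[3.955, -1.013, 0.0, 0.0],
     [-1.007, 3.926, -1.023, 0.0],
     [0.0, -1.013, 3.887, -1.038],
     [0.0, 0.0, -1.021, 3.841]]
C = [0.3068, 0.8669, 1.3168, 2.7997]

x, sweeps = gauss_seidel(A, C)
print([round(v, 4) for v in x], "after", sweeps, "sweeps")
```

Each diagonal coefficient exceeds the sum of the magnitudes of the other coefficients in its row, which is sufficient for the iteration to converge from any starting vector.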










23. Investigate (empirically) the efficiency of the Gauss-Seidel iteration in the case 
of the equations in Prob. 1. 

24. Determine the solution of the equations in Prob. 22, to four places, by use of a 
relaxation procedure. 

25. Experiment with the application of relaxation methods to the equations in 
Prob. 1. 


Section 10.8 


26. Suppose that the equation $x^2 + a_1x + a_2 = 0$ possesses real roots $\alpha$ and $\beta$. 
Show that the iteration $x_{k+1} = -(a_1x_k + a_2)/x_k$ is stable near $x = \alpha$ if $|\alpha| > |\beta|$, the 
iteration $x_{k+1} = -a_2/(x_k + a_1)$ is stable near $x = \alpha$ if $|\alpha| < |\beta|$, and the iteration 
$x_{k+1} = -(x_k^2 + a_2)/a_1$ is stable near $x = \alpha$ if $2|\alpha| < |\alpha + \beta|$. 


27. With the notation of Prob. 26, show that the iteration 

$$x_{k+1} = x_k - (x_k^2 + a_1x_k + a_2)\,\phi(x_k)$$

is stable near $x = \alpha$ if $0 < (\alpha - \beta)\phi(\alpha) < 2$, that the asymptotic convergence factor 
is $\rho = 1 - (\alpha - \beta)\phi(\alpha)$, and that the three iterations of Prob. 26 are the special cases 
in which $\phi(x) = 1/x$, $1/(x + a_1)$, and $1/a_1$. 

28. If $x = \alpha$ is a root of $f(x) = 0$, if successive approximations to $\alpha$ are generated 
by the iteration $x_{k+1} = F(x_k)$, and if $F(x)$ possesses $r + 1$ continuous derivatives and 
is such that 

$$F(\alpha) = \alpha, \qquad F'(\alpha) = F''(\alpha) = \cdots = F^{(r)}(\alpha) = 0,$$

for some $r \ge 1$, show that 

$$x_{k+1} - \alpha = \frac{(x_k - \alpha)^{r+1}}{(r+1)!}\,F^{(r+1)}(\xi),$$

where $\xi$ lies between $x_k$ and $\alpha$, so that the iteration is a process of order $r + 1$ or 
greater. 


29. With the notation of Prob. 28, show that the iteration corresponding to the 
definition 

$$F(x) = x - \phi_1(x)f(x) - \phi_2(x)[f(x)]^2 - \phi_3(x)[f(x)]^3 - \cdots$$

is at least of second order if $1 - \phi_1f' = 0$, at least of third order if also 

$$2\phi_1'f' + \phi_1f'' + 2\phi_2f'^2 = 0,$$

and at least of fourth order if further 

$$3\phi_1''f' + 3\phi_1'f'' + \phi_1f''' + 6\phi_2'f'^2 + 6\phi_2f'f'' + 6\phi_3f'^3 = 0,$$

under the assumption that the $\phi$'s and an appropriate number of their derivatives are 
finite at $x = \alpha$. Thus deduce that the formula 

$$x_{k+1} = x_k - \frac{f_k}{f_k'} - \frac{f_k''}{2f_k'}\left(\frac{f_k}{f_k'}\right)^2 - \left(\frac{f_k''^2}{2f_k'^2} - \frac{f_k'''}{6f_k'}\right)\left(\frac{f_k}{f_k'}\right)^3 - \cdots,$$

with $f_k^{(i)} = f^{(i)}(x_k)$, yields a process of order equal to the number of terms retained in 
the right-hand member. 

30. Rederive the formula of Prob. 29 by writing $x_{k+1} - x_k = h$ and 

$$f(x_k + h) = f_k + hf_k' + \frac{h^2}{2}f_k'' + \frac{h^3}{6}f_k''' + \cdots = 0,$$

assuming an expansion of the form $h = -(\phi_1f_k + \phi_2f_k^2 + \phi_3f_k^3 + \cdots)$, requiring that 
the coefficients of successive powers of $f_k$ vanish in the result of substituting the second 
expansion into the first, and so obtaining the conditions $1 - \phi_1f_k' = 0$, $2\phi_2f_k' - \phi_1^2f_k'' = 0$, 
$6\phi_3f_k' - 6\phi_1\phi_2f_k'' + \phi_1^3f_k''' = 0$, $\cdots$. 

31. Use the formula of Prob. 29 to approximate the real root of $x^3 - x - 1 = 0$, 
taking $x_0 = 1.3$ and calculating separately the approximations to $x_1$ afforded by 
retention of one, two, and three correction terms. Also investigate the approximations 
corresponding to the choice $x_0 = 1$. 

32. The real root $\alpha$ of the equation $x + \log x = 0$ lies between 0.56 and 0.57. 
Show that the iteration $x_{k+1} = -\log x_k$ is unstable near $x = \alpha$, and verify this fact 
by calculation. Then show that the iteration $x_{k+1} = e^{-x_k}$ is stable near $x = \alpha$, and 
determine $\alpha$ to five places. 
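
The contrast between the two iterations of Prob. 32 can be seen numerically with a short sketch such as the following. The helper name `iterate`, the starting value 0.56, and the step counts are assumptions chosen only for illustration; the derivative of $-\log x$ at the root has magnitude $1/\alpha > 1$, whereas that of $e^{-x}$ has magnitude $\alpha < 1$.

```python
import math


def iterate(F, x0, steps):
    """Return the list x0, F(x0), F(F(x0)), ... with `steps` applications of F."""
    xs = [x0]
    for _ in range(steps):
        xs.append(F(xs[-1]))
    return xs


# Stable form: x_{k+1} = exp(-x_k).
stable = iterate(lambda x: math.exp(-x), 0.56, 40)
alpha = stable[-1]

# Unstable form: x_{k+1} = -log(x_k); only a few steps are taken, since the
# iterates move away from the root.
unstable = iterate(lambda x: -math.log(x), 0.56, 4)
errors = [abs(x - alpha) for x in unstable]

print("root by the stable iteration:", round(alpha, 5))
print("errors of the unstable iteration:", errors)
```

The errors of the second sequence grow from step to step even though the starting value is within 0.01 of the root.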

33. Suppose that the solution of Prob. 32 is required, but that only values of 
$\log_{10} x$ are to be used. Determine a convenient value of the constant $c$ for which the 
iteration 

$$x_{k+1} = x_k - c(x_k + \log x_k) = (1 - c)x_k - c(\log 10)\log_{10} x_k$$

is stable near $x = \alpha$, and use the result to determine $\alpha$ to five places. 

34, 35. Repeat the determination of Probs. 32 and 33, using the Newton-Raphson 
iteration with $f(x) = x + \log x$ and with $f(x) = x - e^{-x}$, respectively. 
36. Show that the Newton-Raphson iterations, as applied to $f(x) = x^n - a$ and to 
$f(x) = 1 - (a/x^n)$, for the determination of $\alpha = a^{1/n}$, are of the respective forms 

$$x_{k+1} = \frac{1}{n}\left[(n - 1)x_k + \frac{a}{x_k^{n-1}}\right]$$

and 

$$x_{k+1} = \frac{1}{n}\left[(n + 1)x_k - \frac{x_k^{n+1}}{a}\right],$$

and that, if $\varepsilon_k = \alpha - x_k$, there follows approximately 

$$\varepsilon_{k+1} \approx -\frac{n - 1}{2\alpha}\,\varepsilon_k^2 \qquad \text{and} \qquad \varepsilon_{k+1} \approx \frac{n + 1}{2\alpha}\,\varepsilon_k^2,$$

respectively, when $x_k \approx \alpha$. (Notice that the second iteration formula possesses a 
constant denominator and that the two sequences approach $\alpha$ from above and from 
below, respectively, when $a > 0$.) Also use both iterations to determine (3.4765) 
and (0.049672) to five places. 

37. Use the results of Prob. 29, with $f(x) = x^2 - a$ and with $f(x) = 1 - (a/x^2)$, to 
obtain third-order iterations leading to $\alpha = a^{1/2}$ in the forms 

$$x_{k+1} = \frac{1}{2}\left(x_k + \frac{a}{x_k}\right) - \frac{1}{8x_k}\left(x_k - \frac{a}{x_k}\right)^2,$$

$$x_{k+1} = x_k + \frac{1}{2}x_k\left(1 - \frac{x_k^2}{a}\right) + \frac{3}{8}x_k\left(1 - \frac{x_k^2}{a}\right)^2,$$

and use them to determine $(16,824)^{1/2}$ and $(0.049672)^{1/2}$ to four places. 

38. By applying the Newton-Raphson procedure to $f(x) = 1 - 1/(ax)$, obtain the 
recurrence formula 

$$x_{k+1} = x_k(2 - ax_k),$$

for the iterative determination of the reciprocal of $a$ without effecting division, and 
show that, if $\varepsilon_k$ denotes the error in $x_k$, there follows $\varepsilon_{k+1} = a\varepsilon_k^2$ in this case. Also 
obtain the third-order iteration $x_{k+1} = x_k(3 - 3ax_k + a^2x_k^2)$, and show that the error in 
the calculation of $x_{k+1}$ due to the first neglected term in the formula of Prob. 29 is then 
approximated by $a^2\varepsilon_k^3$ when $x_k \approx 1/a$. 
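
The division-free reciprocal iteration of Prob. 38 lends itself to a brief numerical check. The sketch below is illustrative only: the function name `reciprocal_iterates` and the sample values $a = 7$, $x_0 = 0.1$ are assumptions. Exact fractions are used so that the error relation can be confirmed as an identity, with the error taken as $\varepsilon_k = 1/a - x_k$.

```python
from fractions import Fraction


def reciprocal_iterates(a, x0, steps):
    """Iterate x_{k+1} = x_k (2 - a x_k), which uses no division."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x * (2 - a * x))
    return xs


a = Fraction(7)
xs = reciprocal_iterates(a, Fraction(1, 10), 5)
errors = [1 / a - x for x in xs]          # error taken as (true value) - x_k

for k in range(len(errors) - 1):
    # Each error equals a times the square of the preceding one.
    print(k, float(errors[k]), errors[k + 1] == a * errors[k] ** 2)
```

The starting value must satisfy $0 < x_0 < 2/a$ for the sequence to converge; within that range the number of correct figures roughly doubles at each step.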

39. Suppose that $f(x)$ possesses two zeros $\alpha_1$ and $\alpha_2$ which are nearly coincident, so 
that $f'(x)$ vanishes at a point $\beta$ between $\alpha_1$ and $\alpha_2$. Show that, if $\beta$ is calculated first, 
then initial approximations to the zeros of $f(x)$ are given by 

$$\alpha_{1,2} \approx \beta \pm \sqrt{-\frac{2f(\beta)}{f''(\beta)}}$$

if $f(\beta)$ and $f''(\beta)$ are of opposite sign, after which improved values may be obtained by 
usual methods. Also, use this procedure to determine to four places the two real 
roots of the equation 

$$3x^4 + 8x^3 - 6x^2 - 25x + 19 = 0.$$


40. Determine the smallest root of the equation $\tan x = cx$ to five places, with 
$c = 1.01$, $c = 2$, and $c = 30$. 


Determine all real roots of the following equations to five places: 
41. $x^3 - 2x - 5 = 0$.† 

42. $x^3 - 9x^2 + 18x - 6 = 0$. 

43. $x^4 - 16x^3 + 72x^2 - 96x + 24 = 0$. 

44. 24-374 +1=0. 

45. $x^2 - 3x - 4\sin^2 x = 0$. 


46. Determine $F(x,y)$ and $G(x,y)$ such that the Newton-Raphson iteration for a 
solution $(\alpha,\beta)$ of the equations $f(x,y) = 0$ and $g(x,y) = 0$ is expressed in the form 

$$x_{k+1} = F(x_k, y_k), \qquad y_{k+1} = G(x_k, y_k),$$

and show that $F_x$, $F_y$, $G_x$, and $G_y$ vanish when $(x,y) = (\alpha,\beta)$ in nonexceptional cases. 
47. Determine to five places the real solution of the equations 


x =sin (x + y), y = cos (x — y). 


48. Determine to five places the real solution of the equations 


4x — 272y? + 25 = 0, 4a4y — 38y3 —1=0 


which lies in the first quadrant. 


49. Determine to five places the real solution of the equations 


$$\sin x \sinh y = 0.2, \qquad \cos x \cosh y = 1.2$$


which lies nearest the origin. 
†This equation was used by Wallis in 1685 to illustrate the Newton-Raphson 
method and has been included as an example in most subsequent works dealing with 
the numerical solution of equations. 

Section 10.9 

50. If $\alpha_r$ is a root of the equation $f(x) = x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = 0$, 
and if the coefficients are inexact, such that the relative error in each coefficient does 
not exceed $\eta$ in magnitude, show that the corresponding maximum error in $\alpha_r$ is given 
approximately by 

$$|\delta\alpha_r|_{\max} \approx \eta\,\frac{|a_1\alpha_r^{n-1}| + |a_2\alpha_r^{n-2}| + \cdots + |a_{n-1}\alpha_r| + |a_n|}{|f'(\alpha_r)|}$$

if $\eta$ is small. 
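51. If $\bar{\alpha}$ is an approximation to a root $\alpha$ of the equation $f(x) = 0$, and if $f(\bar{\alpha}) = \varepsilon$, 
show that $\alpha - \bar{\alpha} = -\varepsilon/f'(\xi)$, where $\xi$ is between $\bar{\alpha}$ and $\alpha$, if $f'(x)$ is continuous. 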


52. Show that the result of replacing $x$ by $t + c$ in 

$$f(x) = x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n$$

is of the form $f(t + c) = t^n + R^{(n-1)}t^{n-1} + R^{(n-2)}t^{n-2} + \cdots + R't + R$, where the 
coefficients can be determined by continuing the process leading to (10.9.5) until it 
terminates, with $\alpha$ replaced by $c$. Also illustrate this procedure in the case when 

$$f(x) = x^3 - x - 1$$

and $c = 1.3$, showing that the calculations may be arranged as follows: 

     1      1       1      1      1 
     0      1.3     2.6    3.9 
    -1      0.69    4.07 
    -1     -0.103 
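
The repeated synthetic division of Prob. 52 can be sketched in a few lines. The function name `shift_polynomial` is hypothetical, and the coefficient list is assumed to be given in descending powers; exact fractions are used so that the entries 3.9, 4.07, and -0.103 are reproduced without round-off.

```python
from fractions import Fraction


def shift_polynomial(coeffs, c):
    """Coefficients (descending powers) of f(t + c), by repeated synthetic division.

    Each pass divides by (x - c); the last entry produced by a pass is one of
    the remainders R, R', R'', ..., and the next pass stops one place earlier.
    """
    work = list(coeffs)
    n = len(work) - 1
    for stop in range(n, 0, -1):
        for k in range(1, stop + 1):
            work[k] += c * work[k - 1]
    return work


f = [Fraction(1), Fraction(0), Fraction(-1), Fraction(-1)]   # x^3 - x - 1
shifted = shift_polynomial(f, Fraction("1.3"))
print([str(v) for v in shifted])
```

The successive passes produce the columns 1, 1.3, 0.69, -0.103, then 1, 2.6, 4.07, then 1, 3.9, so that the final list holds the coefficients of $t^3 + 3.9t^2 + 4.07t - 0.103$.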





53. Apply the Lin iteration to the equation 

$$t^3 + 3.9t^2 + 4.07t - 0.103 = 0,$$

obtained in Prob. 52, starting with $t = 0$. Thus obtain the real root of the equation 
$x^3 - x - 1 = 0$ to five places. 

54. The equation of Prob. 43, $x^4 - 16x^3 + 72x^2 - 96x + 24 = 0$, possesses roots 
approximated by 0.3, 1.7, 4.5, and 9.4. Use (10.9.15) or (10.9.16) to predict that the 
Lin iteration will be stable only for the smallest root, with a convergence factor of 
about 0.3, and determine that root to five places by Lin iteration. 

55. Use the result of Prob. 29 to devise a third-order iteration process extending 
(10.9.10), of the form 

$$z^* = z - \frac{R}{R'} - \frac{R''}{R'}\left(\frac{R}{R'}\right)^2,$$

where $R'' = d_{n-2}$ and $d_k = c_k + zd_{k-1}$. Also determine the real zero of (10.8.6) to six 
decimal places by this method, starting with $z = 1.3$. 

56 to 58. Determine the real roots of the equations in Probs. 41, 42, and 44 to five 
places, by either of the methods of §10.9. 

59 to 61. Determine approximately the maximum inherent error in each root 
obtained in Probs. 56 to 58, assuming (a) that each coefficient in the given equation 
(except the leading one) may be in error by $\pm 0.1$ and (b) that each such coefficient is 
correct within 1 per cent (see Prob. 50). 

62. With the notation of §10.9, show that 

$$f(z) = b_n(z) = z^n + a_1z^{n-1} + \cdots + a_{n-1}z + a_n,$$
$$f'(z) = c_{n-1}(z) = z^{n-1} + b_1(z)z^{n-2} + \cdots + b_{n-2}(z)z + b_{n-1}(z),$$

and 

$$\tfrac{1}{2}f''(z) = d_{n-2}(z) = z^{n-2} + c_1(z)z^{n-3} + \cdots + c_{n-3}(z)z + c_{n-2}(z).$$

Also deduce, in particular, that the iteration 

$$z^* = z - \frac{R}{b_{n-1} + zb_{n-2}}$$

may be expected to yield results intermediate between those given by the Lin and 
Newton-Raphson iterations, on the average, when the desired root is small in 
magnitude, and show that its asymptotic convergence factor, near a root $\alpha_r$, is given by 

$$1 + \frac{\alpha_rf'(\alpha_r)}{2a_n + \alpha_ra_{n-1}}.$$

Section 10.10 


63. Determine the largest root in Prob. 42 to four places by Bernoulli iteration. 
64. Determine the largest root in Prob. 43 to four places by Bernoulli iteration. 
Also, after replacing $x$ by $1/z$, determine the smallest root in a similar way. Then 
determine all roots to four places. 
65. Show that the Bernoulli iteration converges very slowly when applied to Prob 
re merges ae this fact. Then translate the origin to a convenient point near 
e root, replace x by 1/z, and apply the iteration to d i i 
bedise scented etermine the reciprocal of the 
66. Show that the $u$, $s$, and $t$ sequences all behave unsatisfactorily when the 
Bernoulli iteration is applied to Prob. 44. Then replace $x$ by $1/z$, use the iteration to 
determine the smallest root to four places, and determine the other real root after 
translating the origin to a nearby point and replacing $x$ by $1/z$. Finally, determine 
the remaining roots and account for the original difficulty. 
67. Apply the Bernoulli iteration to the equation 

$$x^4 - 8x^3 + 39x^2 - 62x + 50 = 0,$$

determining the larger pair of complex roots to three significant figures. 

Section 10.11 


68 to 71. Determine all roots of the e i i 
quations in Probs. 41 i 
places, by the Graeffe procedure. ; tae ieee 


72. Use the Graeffe procedure to determine all roots of the equation of Prob. 67 to 


Section 10.12 
73. Show that the equation 
$$x^4 - 9.00x^3 + 29.08x^2 - 39.52x + 18.82 = 0$$
has roots near $x = 1$ and $x = 2$ and two roots near $x = 3$, and determine the roots to 


four decimal pl b 
starting with npr extracting an approximate quadratic factor by Lin iteration, 
74. Determine all roots of the equation 

$$x^4 + 9x^3 + 36x^2 + 51x + 27 = 0$$

to four decimal places by iterative Lin extraction of a quadratic factor. 
75. With the notation of §10.12, show that the Lin iteration (10.12.11) can be 
written in the form 


qk gS 


LP ae ve - 2" 8 
Ray +58 = f(x1), R22 +8 = f(%2), 


where 2, and x2 are the zeros of x? + px + q, and hence deduce the relations 








and that 


ai(p* — p) + q* — 9) = beeen ax(p* —p) + @*-) =7 6 


Then show that these relations can be written in the forms 


cr ant — 2) = LED, — ot — sitet 20, 


(os — a) (a — 2s) = — BEY + Cot - eyed - 20), 


where x* and x} are the zeros of 2? + p*c + q*, and deduce that, when (21,22) 
(c¥,x%) are near (a1,c2), there follows 


a—-o Y [1 + a fie) (a1 — 21) = pilar — 21), 


ag — Qi 
la 
aes [1 55 ates fee oie. a) ime paler 9); 
ag — a) An 


where $\rho_1$ and $\rho_2$ are the convergence factors listed in (10.12.12). Thus show that, if 
the zeros $x_1$ and $x_2$ of $x^2 + px + q$ approximate two zeros $\alpha_1$ and $\alpha_2$ of $f(x)$, and if $x_1^*$ 
and $x_2^*$ are the zeros of $x^2 + p^*x + q^*$, then $x_1^*$ is generally a poorer approximation to 
$\alpha_1$ than $x_1$ unless $|\rho_1| < 1$ and $x_2^*$ a poorer approximation to $\alpha_2$ than $x_2$ unless $|\rho_2| < 1$. 

76. In the case of a quartic equation, show that the asymptotic convergence factors 
relevant to the root pair $\alpha_1$, $\alpha_2$ are 

$$\rho_1 = \frac{\alpha_1}{\alpha_3\alpha_4}(\alpha_3 + \alpha_4 - \alpha_1), \qquad \rho_2 = \frac{\alpha_2}{\alpha_3\alpha_4}(\alpha_3 + \alpha_4 - \alpha_2),$$

where $\alpha_3$ and $\alpha_4$ are the remaining roots. Show, in particular, that the Lin iteration 
should converge rapidly to the root pair near $x = 3$ in Prob. 73, but that convergence 
to the pair near $x = 1$ and $x = 2$ should be very slow, and verify the last fact by 
calculation. 

77. Determine the first five quadratics yielded by the Lin iteration as applied to the 
equation $x^4 - 4x^3 + 7x^2 - 16x + 12 = 0$, starting with $p = q = 0$, and show that 
one of the zeros of the sequence of quadratics tends to approximate the smallest zero 
($x = 1$) of the given equation. Also, use the fact that the four zeros are $x = 1$, 3, 
$\pm 2i$ to show that this situation is in accordance with the results of Prob. 75. 


Section 10.13 


78, 79. Repeat the determinations of Probs. 73 and 74, using the Bairstow iteration. 
80. Show that, if the Newton-Raphson iteration is applied to the equations $b_{n-1} = 0$ 
and $b_n = 0$, rather than to the equivalent equations $b_{n-1} = 0$ and $b_n + pb_{n-1} = 0$, 
then the Bairstow procedure is modified only to the extent that $\bar{c}_{n-1}$ is replaced by 
$c_{n-1}$ in (10.13.16), so that all elements in the $c$ column then are to be calculated from 
the $b$ column by the same rule. Also, apply this procedure to the example 
in the text, showing that the modification leads to somewhat slower convergence in 
that case. 


81. With the notation of (10.12.2) and (10.13.11), show that the Bairstow iteration 
can be described by the equations 

$$(S' - pR')\,\Delta p + R'\,\Delta q = R,$$
$$-qR'\,\Delta p + S'\,\Delta q = S.$$

(These are the forms originally given by Bairstow.) 

JUSTIFICATION OF THE CROUT REDUCTION 


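The Gauss reduction of §10.3 reduces the set of equations 

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= c_1, \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= c_2, \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{n1}x_1 + a_{n2}x_2 + a_{n3}x_3 + \cdots + a_{nn}x_n &= c_n
\end{aligned} \tag{A1}$$

to an equivalent set of the form 

$$\begin{aligned}
x_1 + a'_{12}x_2 + a'_{13}x_3 + \cdots + a'_{1n}x_n &= c'_1, \\
x_2 + a'_{23}x_3 + \cdots + a'_{2n}x_n &= c'_2, \\
\cdots\cdots\cdots\cdots\cdots\cdots & \\
x_n &= c'_n,
\end{aligned} \tag{A2}$$

after which the required values of $x_n, x_{n-1}, \ldots, x_1$ are obtained simply 
by solving the equations (A2) successively, in reverse order. 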

Since, in the Gauss reduction, the kth equation of (A2) is obtained by a 
sequence of operations which involve the subtraction of multiples of the 
first $k - 1$ equations of (A2) from the kth equation of (A1) and the division 
of the result by a constant, it follows also that the kth equation of 
(A1) can be expressed as a linear combination of the first $k$ equations of 
(A2), so that a set of constants $a'_{ij}$ exists, with $i \ge j$, such that 

$$\begin{aligned}
a'_{11}c'_1 &= c_1, \\
a'_{21}c'_1 + a'_{22}c'_2 &= c_2, \\
\cdots\cdots\cdots\cdots & \\
a'_{n1}c'_1 + a'_{n2}c'_2 + a'_{n3}c'_3 + \cdots + a'_{nn}c'_n &= c_n.
\end{aligned} \tag{A3}$$

The Crout reduction amounts to first determining the coefficients $a'_{ij}$ 
in such a way that the elimination of $c'_1, \ldots, c'_n$ between (A2) and (A3) 
leads to (A1), then determining $c'_1, \ldots, c'_n$ from (A3), and finally 
resolving (A2) for $x_1, \ldots, x_n$ by the "back solution" of the Gauss procedure. 

In order to simplify the derivation of formulas for the determination of 
the coefficients $a'_{ij}$ involved in both (A2) and (A3), it is convenient to 
introduce the temporary notations 

$$\alpha_{ij} = \begin{cases} a'_{ij} & (i \ge j), \\ 0 & (i < j), \end{cases} \qquad \beta_{ij} = \begin{cases} 0 & (i \ge j), \\ a'_{ij} & (i < j). \end{cases} \tag{A4}$$

The three sets (A1) to (A3) can then be specified by the equations 

$$\sum_{j} a_{ij}x_j = c_i, \tag{A1'}$$

$$x_i + \sum_{j} \beta_{ij}x_j = c'_i, \tag{A2'}$$

and 

$$\sum_{k} \alpha_{ik}c'_k = c_i, \tag{A3'}$$

where all indices range from 1 to $n$. 

The introduction of (A2') into (A3') then gives 

$$\sum_{j}\Big(\alpha_{ij} + \sum_{k}\alpha_{ik}\beta_{kj}\Big)x_j = c_i,$$

and this relation is equivalent to (A1') if 

$$\alpha_{ij} + \sum_{k=1}^{n}\alpha_{ik}\beta_{kj} = a_{ij}. \tag{A5}$$

In virtue of (A4), the first term $\alpha_{ij}$ is zero unless $i \ge j$ and the summand 
in the second term vanishes unless both $k \le i$ and $k < j$. Thus, when 
$i \ge j$, (A5) becomes 

$$a'_{ij} + \sum_{k=1}^{j-1} a'_{ik}a'_{kj} = a_{ij} \qquad (i \ge j), \tag{A6}$$

whereas, when $i < j$, it can be written in the form 

$$a'_{ii}a'_{ij} + \sum_{k=1}^{i-1} a'_{ik}a'_{kj} = a_{ij} \qquad (i < j). \tag{A7}$$

These relations, together with the relations 

$$a'_{ii}c'_i + \sum_{k=1}^{i-1} a'_{ik}c'_k = c_i \tag{A8}$$

and 

$$x_i + \sum_{k=i+1}^{n} a'_{ik}x_k = c'_i, \tag{A9}$$

which are equivalent to (A3') and (A2'), respectively, are identical with 
the relations (10.4.4) to (10.4.7) of §10.4, establishing the validity of the 
Crout reduction as described in that section. 
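
The order in which the relations can be used is perhaps easiest to see in a short program. The following is a minimal sketch, not the tabular hand procedure of §10.4: it assumes that no diagonal element $a'_{jj}$ vanishes (no reordering of equations is attempted), and the function name `crout_solve` is hypothetical. The sample data are the equations of Prob. 1.

```python
def crout_solve(a, c):
    """Solve a x = c by the Crout relations (A6) to (A9), without pivoting.

    Returns (aux, cp, x), where aux[i][j] holds a'_ij, cp[i] holds c'_i,
    and x is the solution.
    """
    n = len(c)
    aux = [[0.0] * n for _ in range(n)]
    cp = [0.0] * n
    for j in range(n):
        # (A6): elements on and below the diagonal in column j (i >= j).
        for i in range(j, n):
            aux[i][j] = a[i][j] - sum(aux[i][k] * aux[k][j] for k in range(j))
        # (A7): elements to the right of the diagonal in row j.
        for m in range(j + 1, n):
            s = sum(aux[j][k] * aux[k][m] for k in range(j))
            aux[j][m] = (a[j][m] - s) / aux[j][j]
        # (A8): transformed right-hand member c'_j.
        cp[j] = (c[j] - sum(aux[j][k] * cp[k] for k in range(j))) / aux[j][j]
    # (A9): back solution, in reverse order.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = cp[i] - sum(aux[i][k] * x[k] for k in range(i + 1, n))
    return aux, cp, x


A = [[1.4, 2.3, 3.7], [3.3, 1.6, 4.3], [2.5, 1.9, 4.1]]
C = [6.5, 10.3, 8.8]
aux, cp, x = crout_solve(A, C)
print(x)
```

Each column of elements on and below the diagonal is followed by the row of elements to its right, so that every sum in (A6) and (A7) involves only elements already determined, and a single array `aux` suffices to hold both sets of coefficients.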

Clearly, the compactness of the relevant tabulation follows from the 
fact that, after suppressing the diagonal 1’s in the coefficient matrix of 
(A2) and the right-hand members of (A3), which are also contained in the 
matrix of (A1), the remaining elements of the two matrices associated with 
(A2) and (A3) can be recorded in a single auxiliary matrix. 

In order to establish the relation 

$$a'_{ii}a'_{ij} = a'_{ji} \qquad (i < j) \tag{A10}$$

in the special cases when the given coefficient array is symmetric, so that 

$$a_{ij} = a_{ji}, \tag{A11}$$

we may verify that (A6) and (A7) imply the relation 

$$a'_{ii}a'_{ij} - a'_{ji} = a_{ij} - a_{ji} - \sum_{k=1}^{i-1}\big(a'_{ik}a'_{kj} - a'_{jk}a'_{ki}\big) \qquad (i < j),$$

which can be written in the form 

$$a'_{ii}a'_{ij} - a'_{ji} = \sum_{k=1}^{i-1}\big[a'_{kj}(a'_{kk}a'_{ki} - a'_{ik}) - a'_{ki}(a'_{kk}a'_{kj} - a'_{jk})\big] \qquad (i < j), \tag{A12}$$

if (A11) is true. When $i = 1$, the sum on the right is absent, so that 
(A10) is established in that case. When $i > 1$, (A12) expresses $a'_{ii}a'_{ij} - a'_{ji}$ 
as a linear combination of terms of the form $a'_{rr}a'_{rs} - a'_{sr}$, where $r < i$ 
and $r < s$, so that (A10) is established by induction on $i$. 

The validity of the calculational-error check described in §10.4 follows 
from the fact that an increase of each solution element $x_i$ by unity would 
correspond to an increase of $c_i$ by $\sum_{k=1}^{n} a_{ik}$, according to (A1'), and to an 
increase of $c'_i$ by $1 + \sum_{k=i+1}^{n} a'_{ik}$, according to (A2'). 

Since the kth equation of (A2) can be obtained by subtracting from the 
kth equation of (A1) a certain linear combination of the first $k - 1$ 
equations of (A1), and dividing the result by $a'_{kk}$, it follows, from the 
elementary properties of determinants, that the determinant of any square 
array of order $k$ formed from elements in the first $k$ rows of the augmented 
matrix of (A1) is given by the result of multiplying the determinant of the 
corresponding array in (A2) by $a'_{11}a'_{22}\cdots a'_{kk}$. In particular, we 
obtain the useful results 

$$a_{11} = a'_{11}, \qquad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a'_{11}a'_{22}, \qquad \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a'_{11}a'_{22}a'_{33},$$

$$\ldots, \qquad \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} = a'_{11}a'_{22}\cdots a'_{nn}. \tag{A13}$$

When the matrix composed of the coefficients in (A1) is symmetric, it 
is said to be also positive definite if and only if each of the $n$ principal 
minors indicated in (A13) is positive. It follows that this matrix is 
positive definite if and only if all the diagonal elements of the associated 
Crout auxiliary matrix are positive. 
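
The relation (A13) and the resulting test for positive definiteness can be illustrated by a short sketch. The function names `crout_diagonal` and `is_positive_definite` are hypothetical, no reordering of rows is attempted, and a vanishing diagonal element is simply reported as a breakdown. The first sample matrix is the symmetric coefficient array of Prob. 6; the second is a small indefinite array chosen for contrast.

```python
def crout_diagonal(a):
    """Diagonal elements a'_11, ..., a'_nn of the Crout auxiliary matrix."""
    n = len(a)
    aux = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j, n):                       # relation (A6)
            aux[i][j] = a[i][j] - sum(aux[i][k] * aux[k][j] for k in range(j))
        if aux[j][j] == 0:
            raise ZeroDivisionError("zero diagonal element; reduction breaks down")
        for m in range(j + 1, n):                   # relation (A7)
            s = sum(aux[j][k] * aux[k][m] for k in range(j))
            aux[j][m] = (a[j][m] - s) / aux[j][j]
    return [aux[i][i] for i in range(n)]


def is_positive_definite(a):
    """For a symmetric array: True if every Crout diagonal element is positive."""
    return all(d > 0 for d in crout_diagonal(a))


S = [[8.467, 5.137, 3.141, 2.063],
     [5.137, 6.421, 2.617, 2.003],
     [3.141, 2.617, 4.128, 1.628],
     [2.063, 2.003, 1.628, 3.446]]
d = crout_diagonal(S)
print("diagonal elements:", d)
print("positive definite:", is_positive_definite(S))
print("indefinite example:", is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))
```

By (A13), the product of the first $k$ diagonal elements equals the kth principal minor, so the determinant of the full array is obtained as the product of all of them.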
DIRECTORY OF METHODS 


A. Interpolation 
1. Based on polynomials 
a. Using an arbitrary set of ordinates, without differences 
(1) Noniterative: §3.2 
(2) Iterative: §2.7 
b. Using differences formed from ordinates at equally spaced points 
(1) Near beginning or end of tabulation: §4.3 
(2) Inside tabular range 
(a) Using both odd and even differences: §§4.5, 4.6 
(b) Using only even differences or only odd differences: 
§4.7 
(3) With throwback: §4.10 
c. Using divided differences formed from an arbitrary set of 
ordinates: §2.5 
d. Using ordinates and slopes: §8.2 
e. Using ordinates at appropriately selected points: §§9.6, 9.7 
f. In two-way tables: Probs. 55 to 57 of Chap. 5 
g. Inverse: §2.8 
2. Based on ratios of polynomials: §§9.9 to 9.12 
3. Based on sines and/or cosines: Prob. 7 of Chap. 3; see also B.4 
4. Based on exponential functions: see B.5† 
B. Approximation 
1. By polynomials 
a. Determined as truncated Taylor expansions: §§1.3, 1.7 
b. Determined by exact fit over a discrete set of points: see A.1 
c. Determined by least-squares methods 
(1) Using an arbitrary finite set of ordinates: §7.3 
(2) Using ordinates at equally spaced points: §7.11 
† The approximating functions obtained by least squares incorporating a number 
of ordinates equal to the number of independent coordinate functions fit the data 
exactly at those points, and thus are strictly interpolative functions. 
(3) Using a continuous range of ordinates 
(a) Over a finite interval: §§7.5, 7.8; Prob. 31 of Chap. 7 
(b) Over a semi-infinite interval: §7.6 
(c) Over an infinite interval: §7.7 
d. Economized by use of Chebyshev polynomials: §9.8 
2. By products of exponential functions and polynomials: §§7.6, 7.7 
3. By ratios of polynomials: see A.2 
4. By sines and/or cosines 
a. With prescribed periods 
(1) Using a finite set of ordinates: §§9.3, 9.7; Prob. 31 of Chap. 7 
(2) Using a continuous range of ordinates: §9.2 
b. With periods to be determined: §9.5 
5. By exponential functions: §9.4 


C. Numerical Differentiation 


1. Using ordinates without differences: §§3.3, 3.8 
2. Using differences formed from ordinates at equally spaced points 
a. Near beginning or end of tabulation: §5.3; Prob. 5 of Chap. 4 
b. Inside tabular range 
(1) Near tabular point: §5.3; Prob. 11 of Chap. 4 
(2) Near mid-point between tabular entries: §5.3; Prob. 13 of 
Chap. 4 


D. Numerical Integration 


1. Using ordinates at equally spaced points, without differences: 
§§3.5, 5.10 
2. Using differences based on ordinates at equally spaced points 
a. Over range near beginning or end of tabulation: §5.4; Prob. 5 
of Chap. 4 
b. Over range centered at interior tabular point: §5.6; Prob. 11 
of Chap. 4 
c. Over range centered midway between interior tabular points: 
Prob. 13 of Chap. 4 
3. Using calculated ordinates at appropriately selected points 


a. Integrals of the form [’ f(a) dx: §8.5 

b. Integrals of the form i ef (2) dz: §8.6 

c. Integrals of the form if : (a — a)Pe-e*f(x) dx: §8.6 
d. Integrals of the form | ° : ef (x) da: §8.7 


e. Integrals of the form {" f(x) da/+/a? — x: §8.8; Prob. 37 of 
Chap. 8 
J. Integrals of the form re * S@) Va! — 2° de: Prob, 24 of Chap. 8 
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g. Integrals of the form i : (x — a)*(b — x)*f(x) dx: §8.9 
h. Integration formulas involving ordinates at one or both of the 
integration limits: §§8.11, 8.12; Probs. 28 and 37 of Chap. 8 
i. Integration formulas employing equal weights: §8.13 
j. Algebraic derivation of miscellaneous formulas: §8.14 
4. Using ordinates and slopes: §8.3 (see also §6.14) 
5. Repeated: §§5.5, 5.6; Prob. 10 of Chap. 5 
6. Two-way: Probs. 53 and 54 of Chap. 5 
E. Summation of Series 
1. Finite sums of polynomials: §§5.8, 7.9 
2. Approximate summation of series: §§5.8, 5.9; Probs. 7 of Chap. 1, 
25 of Chap. 5 
a. Terms of constant sign: §5.9 [Eqs. (5.9.4) to (5.9.6)] 
b. Terms of alternating sign: §5.9 [Eqs. (5.9.9) to (5.9.11)] 
3. Euler summation of divergent series: §5.9 
F. Smoothing of Data 
1. By determining smooth approximating function: §7.11 (see also 
B.1c and B.4) 
2. By point-by-point modification of data: §7.13 
G. Numerical Solution of Ordinary Differential Equations 
1. Initial-value problems: see §6.19 
2. Boundary-value problems: §6.17 
3. Characteristic-value problems: §6.18 
H. Numerical Solution of Equations 
1. Sets of linear algebraic equations 
a. By use of determinants: §10.2 
b. By a sequence of reductions: §§10.3, 10.4 
c. By iteration: §10.7 
2. Nonlinear equations 
a. General iterative methods: §10.8 (see also §§2.8, 9.12) 
b. Special iterative methods for algebraic equations 
(1) Approximate determination of largest or smallest root: 
§10.10 
(2) Simultaneous approximate determination of all roots by 
root squaring: §10.11 
(3) Improvement of approximation by iterated synthetic 
division: §10.9 
(4) Simultaneous approximate determination of two roots: 
§§10.12, 10.13 


1. Inversion of power series: Probs. 15 and 39 of Chap. 1. 
2. Expansion of one function in powers of another: §1.7 
3. Checking tables by use of differences: §4.9 
4. Expression of differences in terms of derivatives: §5.3 
5. Subtabulation: §5.7; Probs. 16 and 17 of Chap. 5 
6. Calculation of mean values over given intervals from mean values 
over other intervals: Prob. 18 of Chap. 5 
7. Determination of unknown periodicities from empirical data: §9.5 
8. Continued-fraction expansions: §§9.9 to 9.12 
9. Evaluation of determinants: §10.4 
10. Inversion of matrices: §10.5 

INDEX 


Boldface figures in parentheses refer to problem numbers 


Adams method, 198, 216 
modified, 201 
Aitken’s δ² process, 445 
Aitken’s iterative interpolation, 49 
Approximant, 396 
Approximation, 2 
methods of, 502-503 
Asymptotic convergence factor, 444 
Asymptotic series, 8 
Asymptotic stability, 444 
Augmented matrix, 424 
Averaging operator, 128 


Backward differences, 92 
Bairstow’s iteration, 472 
Bernoulli numbers, 150 

Bernoulli polynomials, 177(19) 
Bernoulli’s iteration, 458 

Bessel’s interpolation formula, 101 
Binomial coefficients, 66, 283 
Birge-Vieta iteration, 453n. 
Boundary-value problems, 239 
Bürmann series, 25 


Central differences, 93 
mean, 100 

Cesàro summation, 160 

Characteristic-value problems, 243 

Chebyshev approximation, 279 

Chebyshev interpolation, 389 

Chebyshev polynomials, 280, 290, 
308 (81), 349, 388, 417(26) 

Chebyshev quadrature, 345 

Chebyshev-Gauss quadrature, 330 

Checking tables, 111 

Cholesky's reduction, 420n. 

Chopping, 8n. 


Christoffel-Darboux identity, 322 
Coefficient matrix, 424 
Cofactor, 425 
reduced, 426 
Continued-fraction approximation, 395 
convergents of, 396, 402 
by Thiele’s expansions, 406 
Convergence of series, tests for, 
27 (5, 6), 28(9) 
Convergence factor, 213, 444 
asymptotic, 444 
Convergents of continued fractions, 
396, 402 
Cotes integration formulas (Newton- 
Cotes), 71, 79 
Cramer’s rule, 426 
Critical tables, 59(41) 
Crout’s reduction, 429, 486 
check columns for, 432, 438, 488 
Cubature, 121, 186(53, 54) 


Deferred approach to the limit, 78n., 
238n. 
Degree of precision, 162, 318 
Delta operator, 130 
Derivative, inverse, 409 
reciprocal, 409 
Determinants, evaluation of, 432 
Jacobian, 451 
Vandermonde’s, 85(5) 
Difference equations, 203, 458 
Difference operator, 128 
Differences, backward, 92 
central, 93 
divided, 36, 38, 395 
forward, 91 
inverted, 397 
mean central, 100 
modified for throwback, 113 
reciprocal, 406 
Differential equations, numerical solu- 
tion of, 188-249 

boundary-value problems, 239 
characteristic-value problems, 243 
selection of method for, 247-249 

Differential operator, 129 

Differentiation, numerical, 64, 82, 
122(5), 123(11, 18), 134, 503 

Distribution, normal, 17 

Divided differences, 36, 38, 395 


Economization of power series, 391 
Eigenvalue, 243 
Equations, normal, 260 
solution of, 424-477, 504 
(See also Differential equations; 
Linear algebraic equations) 
Error formulas, in integration, 161 
G method, 164 
Q method, 170 
V method, 168 
in interpolation, 45 
Error function, 18, 28(11), 29(12, 13) 
Errors, bounds on, 14 
classification of, 4 
detection of, 111 
gross, 4 
inherent, 4, 436, 438 
probable, 20 
random, 17 
relative, 9 
round-off, 4 
truncation, 5 
Euler sum of divergent series, 157 
Euler’s constant, 180(26) 
Euler’s method, 198 
modified, 201 
Euler’s sum formula, first, 151 
second, 154 
Euler’s transformation, 157 
modified, 158 
Euler-Maclaurin sum formula, 151, 229 
Everett’s interpolation formulas, 103 
Exponential approximation, 378 
Extrapolation, Aitken’s, 445 
Richardson’s, 78n., 238n. 


Factorial, Stirling’s approximation to, 
80, 181(27) 


Factorial power functions, 282 

False position, method of, 446 

Filon quadrature, 84 

First law of the mean, 22 

Forward differences, 91 

Fourier approximation, 369, 373 
Frequency function, 17, 31(23) 
Friedman’s iteration, 476 
Fundamental theorem of algebra, 26 


Gauss quadrature, 319 
Gauss’s interpolation formulas, 97 
trigonometric, 85(7) 
Gauss’s reduction, 428 
check column for, 477(4) 
Gauss’s sum formula, 156 
Gauss-Jordan reduction, 429 
check column for, 477(4) 
Gauss-Seidel iteration, 439 
Gill’s method, 238 
Graeffe’s iteration, 462 
Gram approximation, 288 
Gram polynomials, 290 
Gregory’s interpolation formulas 
(Newton’s interpolation formulas), 
95 
Gregory’s sum formula, 155 
Gross errors, 4 


Halley’s method, 412n. 

Hardy’s rule, 161 

Harmonic analysis, 369, 373, 389 

Hermite approximation, 277 

Hermite interpolation, 314 
trigonometric, 85(7) 

Hermite polynomials, 277 

Hermite quadrature, 317 

Hermite-Chebyshev quadrature, 365 (41) 

Hermite-Gauss quadrature, 327 

Heun’s method, 237 

Hitchcock’s iteration, 472 

Hutton summation, 160 


Ill-conditioned systems, 439 
Inequalities, sets of, 476 
Influence function, 164 
Inherent errors, 4 
in linear algebraic equations, 436 
check column, 438 
Inner product, 430 

Integral equations, 358, 412 

Integral operator, 129 

Integration, numerical, 64, 71, 122(5), 
123(11, 18), 138, 142, 144, 160, 
176(10), 186(68, 64), 312-357, 
503-504 

(See also Error formulas) 

Interpolation, error formulas in, 45 

to halves, 103 

inverse, 50, 126(26), 411 
iterated, 49 

methods of, 502 
osculating, 316 
trigonometric, 85(7), 389 

Interpolation formulas (see specific 
formulas) 

Interpolation series, 114 

Inverse derivatives, 409 

Inverse interpolation, 50, 126(26), 411 

Inverse matrix, 426, 434 

Inverse operator, 129 

Inversion, of matrices, 434 

of series, 30(15), 34(39) 

Inverted differences, 397 

Iterated interpolation, 49 

Iterative process, order of, 448, 
479(28-30) 


Jacobi polynomials, 282 
Jacobi-Gauss quadrature, 331 
Jacobian determinant, 451 


Kronecker delta, 61, 315 
Kutta’s methods, 236, 237 


Lagrange’s interpolation formula, 60 
Laguerre approximation, 274 
Laguerre polynomials, 275 
generalized, 282 
Laguerre-Chebyshev quadrature, 365(40) 
Laguerre-Gauss quadrature, 325 
generalized, 327 
Lanczos’ economization technique, 391 
Law of the mean, first, 22 
second, 22 
Least squares, over continuous intervals, 
258, 260 
over discrete ranges, 258, 261, 287 


error in coefficients, 266 
observed errors, estimation of, 268 
principle of, 258 
weight in, 259, 265 

Legendre approximation, 272 

Legendre polynomials, 273, 387 

Legendre-Gauss quadrature, 323 

Leibnitz’ formula, 66 

Linear algebraic equations, 424-443 
homogeneous, 427 
inherent errors in, 436 
solvability of sets of, 427 

Linear interpolation, 35 

Lin’s iteration, 454, 469 
modified, 472 

Lobatto quadrature, 343 


Maclaurin’s integration formulas, 88(26) 
Madelung’s method, 255(41) 
Matrix, augmented, 424 

coefficient, 424 

inverse, 426 

inversion of, 434 

positive definite, 489 

rank of, 426 

transpose of, 441 
Mean value, 18 
Mean-value theorems, 22 
Mehler quadrature, 333 
Milne’s methods, 201, 218, 224 
Minor, 425 

principal, 489 
Modified differences, 113 
Modulus of precision, 18 
Monte Carlo methods, 249 
Moulton’s method, 201 


Neville’s method, 53 

Newton’s backward-difference formula, 
95 

Newton’s divided-difference formula, 43 

confluent forms, 55(17) 

Newton’s forward-difference formula, 95 

Newton’s rule, 74 

Newton-Cotes integration formulas, 71, 
79 

Newton-Raphson iteration, 447, 451, 453 

Noise level, 112 

Normal distribution, 17 

Normal equations, 260 
Numerical differentiation, 64, 82, 122(5), 
123(11, 18), 134, 503 


Obrechkoff’s formulas, 231 
Odd-harmonic function, 130 
Operator, averaging, 128 
delta, 130 
difference, 128 
differential, 129 
integral, 129 
inverse, 129 
shifting, 129 
Order of iterative process, 448, 
479 (28-30) 
Orthogonal polynomials, 269, 287 
Orthogonality, 261 
Osculating interpolation, 316 


Parabolic rule, 75 

Parasitic solutions, 205 

Periodicities, determination of, 382 

Picard’s method, 194 

Positive definite matrix, 489 

Precision, degree of, 162, 318 
modulus of, 18 

Principal diagonal, 430 

Principal minor, 489 

Probable error, 20 

Prony’s method, 379 


Quadrature (see Integration; specific 
forms of quadrature) 


Radau quadrature, 338 

Random errors, 17 

Rank, 426 

Rational-function approximation, 399 

Rayleigh’s principle, 247 

Reciprocal, iterative evaluation of, 
481(38) 

Reciprocal derivatives, 409 

Reciprocal differences, 406 

Reduced cofactor, 426 

Reduced penultimate remainder, 472 

Reducibility, 399n. 

Region of determination in difference 
table, 45 

Regula falsi, 446 

Relative error, 9 

Relaxation, 440 
Relaxation table, 441 

Residuals, 259, 396, 433, 440 
Richardson extrapolation, 78n., 238n. 
Rolle’s theorem, 22 

Root, iterative extraction of, 480(36, 37) 
Root mean square value, 19 
Root squaring, 462 

Round-off errors, 4 

Rounding, 8, 21 

Runge-Kutta methods, 233, 236 


Second law of the mean, 22 
Series, asymptotic, 8 
Bürmann, 25 
convergence tests for, 27(5, 6), 28(9) 
inversion of, 30(15), 34(39) 
summation of (see Summation of 
series) 
Taylor, 5, 23, 193 
Sheppard’s rules, 45n., 122(8) 
Shifting operator, 129 
Short-range stability, 207 
Significant figures, 9 
Simpson’s rule, 73, 141, 146 
Smoothing formulas, 295 
Sonine polynomials, 282 
Square root, iterative determination of, 
480(36, 37) 
Stability, asymptotic, 444 
short-range, 207 
Standard deviation, 18 
Steffensen’s error test, 27 (5) 
Steffensen’s interpolation formula, 105 
Stirling numbers, of first kind, 135 
of second kind, 138 
Stirling’s approximation to factorial, 80, 
181 (27) 
Stirling’s interpolation formula, 99 
Störmer’s method, 223 
Subtabulation, 146, 176(16) 
Successive substitutions, method of, 448, 
450 
Summation by parts, 286, 310(39) 
Summation of series, 27(7), 149, 155, 
180(25), 284, 504 
Symmetric functions, 356 
Synthetic division, 451, 468 
with quadratic divisor, 468 


Tables, checking, 111 
critical, 59(41) 
of interpolation coefficients, 70, 118 





Taylor series, 5, 23, 193 

Thiele’s continued-fraction expansions, 
406 

Throwback, 112 

Transpose of matrix, 441 

Trapezoidal rule, 75, 155, 375n. 

Trigonometric approximation, 369, 373, 
382, 389 

Trigonometric interpolation, 85(7), 389 

Truncation errors, 5 

Tschebycheff (see Chebyshev) 
Undetermined coefficients, method of, 
230 


Vandermonde’s determinant, 85(5) 
Variance, 20 


Weddle’s rule, 160 
Weierstrass’s theorem, 3n. 
Weight in least squares, 259, 265 








